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The New Oracle Threat 


pam its long-running campaign to ac- 
quire other technology companies, Oracle 
recently announced the pending acquisition of 
Sun Microsystems. Unlike the aborted IBM-Sun 
merger, the Sun acquisition makes a lot of sense. 
Oracle has been a major leader in the technology 
marketplace, and the Sun acquisition strengthens 
Oracle in a number of fundamental ways. Notably, 
the addition of Sun puts Oracle in the hardware 
business for the first time. 


What Sun Brings to Oracle 

Oracle will now have technology that covers the en- 
tire breadth of IT technology including: 

e Hardware—Oracle will now have Sun’s SPARC 
server line and Sun’s well-received x64 line of 
servers; plus it now has Sun’s Unified Storage 
System and their StorageTek tape libraries. 
Operating systems—Oracle has offered their Red 
Hat-based Oracle Enterprise Linux OS for some 
time. Now they will also have Sun’s Solaris OS. 
Database—Oracle 11g and the Oracle Real Ap- 
plication Cluster (RAC) are the core of Oracle’s 
business. The Sun purchase adds the open source 
MySQL database. 

Language—Apart from PL/SQL, Oracle’s SQL 
variant, Oracle has never been in the language 
business. Sun brings Oracle the popular Java 
language to add to their product suite, along 
with Sun’s NetBeans IDE. Java might very well 
be Sun’s most important asset. 

Enterprise Application Software—Oracle’s 
previous acquisitions such as PeopleSoft, Siebel, 
and JD Edwards helped put the company on 

the map in this space. In spite of Java, Sun has 
never been in the enterprise application software 
market. 

Office applications—Oracle didn’t have a pres- 
ence here before. Sun’s multi-platform StarOffice 
is one of the primary competitors to Microsoft’s 
ubiquitous Office Suite. 
Virtualization—Although its adoption is small, 
Oracle has its own Xen-based Oracle VM virtu- 
alization platform. Now Oracle will have Sun’s 
promising VirtualBox virtualization software. 
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The Oracle vs. Microsoft 
Competition 

So what does Oracle’s Sun acquisition mean to 
Microsoft and SQL Server IT professionals? It de- 
pends on how Oracle uses its new acquisition. One 
thing seems certain—the Sun acquisition will com- 
pletely polarize Oracle and Microsoft. They have 
been competitors ever since Microsoft acquired 
SQL Server from Sybase, but this move will put 
the majority of non-Microsoft IT technology in 
Oracle’s software portfolio. The Oracle camp will 
provide Oracle databases, Java, and Linux. Mi- 
crosoft has Windows, SQL Server, and the .NET 
Framework. Businesses will tend to fall into either 
one camp or the other. 

The acquisition of Sun will make Oracle even 
more formidable because it will be able to offer an 
end-to-end solution that should appeal to enter- 
prises. Oracle can now deliver a turnkey hardware, 
OS, database, and development solution. While I 
doubt this will influence existing SQL Server cus- 
tomers, it definitely will enable Oracle to be very 
competitive when organizations seek new solu- 
tions or look to upgrade, especially at the enter- 
prise level. 


What Will Happen to MySQL? 

The addition of MySQL and StarOffice could 
enable Oracle to be formidable at the low end as 
well. Although the pesky MySQL database does 
compete with other Oracle offerings and SQL 
Server at the low end of the market, I don’t 
think Oracle will pull the plug on MySQL. 
Oracle doesn’t often kill off the technolo- 
gies it buys, and MySQL’s widespread N | 
adoption by ISPs should ensure its conti- 
nuity—even if it does compete with Oracle’s 
flagship database. 

SQL Server’s lead in usability, business intel- 
ligence (BI), and pricing should allow it to con- 
tinue to gain market share. However, the combina- 
tion of Oracle and Sun will make Oracle an even 
tougher competitor—especially at the high end of 
the market. [SQL] 
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Seven Tips for SQL Server 


Production DBAs 


Bo: SQL Server has been around for quite 
a while, there are many tools and techniques 
available to help you use it—so many that you might 
not even be aware of them. Here are seven tips on 
how you might use some of the available tools and 
techniques to make your job as a SQL Server produc- 
tion DBA easier. 

1. Use the Forfiles utility to delete old database 
backup files. The Forfiles utility (forfiles.exe) comes 
installed with Windows Server 2003 and later. 
Forfiles enables batch processing of files. In conjunc- 
tion with a SQL Server Agent job, you can use this 
utility to delete old database backup files to remove 
dependencies on items such as SQL Server mainte- 
nance plans, the xp_cmdshell extended stored proce- 
dure, or VBScript objects. For example, the following 
command uses Forfiles to delete all .bak files that are 
two days old or older in the E:\sqlbackup folder: 


forfiles /p "e:\sqlbackup" /m "*.bak" 
/c "cmd /c del /Q @path" /d -2 


Note that you would enter this command all on one line 
at the command prompt. For more information about 
this utility, see Microsoft’s Forfiles web page at technet2 
-microsoft.com/WindowsServer/en/Library/9660fea1- 
65c7-48cf-b466-204ba 15938 1e1033.mspx. 

2. Use ALTER USER WITH LOGIN to repair 
orphaned logins. Starting with SQL Server 2005 SP2, 
T-SQL’s ALTER USER command includes the 
WITH LOGIN clause. This clause repairs orphaned 
logins by changing the database user’s SID to the 
server login’s security identifier. It repairs both 
Windows and SQL Server logins. Orphaned users 
are created, for example, when a database is restored 
from another server and the login was independently 
created (not copied from the other server). For more 
information about ALTER USER WITH LOGIN, 
see Laurentiu Cristofor’s blog “SQL Server 2005: 
Some new security features in SP2” at blogs.msdn 
.com/Icris/archive/2007/02/19/sql-server-2005-some- 
new-security-features-in-sp2.aspx and the ALTER 
USER web page in SQL Server Books Online (BOL) 
at msdn.microsoft.com/en-us/library/ms176060 
.ASPX. 

3. Use sp_addsrvrolemember to give yourself the 
sysadmin role. In SQL Server 2005, the Windows 
built-in Administrators group isn’t given the sysadmin 
role by default. So, what happens if you inherit a 
SQL Server instance and you need to give yourself 
the sysadmin role? As a Windows Administrator, you 
can start the SQL Server instance in single-user mode 
(aka maintenance mode), then run the sp_addsrv- 


rolemember system stored procedure in the Sqlemd 
utility to add your Windows login to the sysadmin 
role. For more information about this technique, see 
Raul Garcia’s blog “Disaster Recovery: What to do 
when the SA account password is lost in SQL Server 
2005” (blogs.msdn.com/raulga/archive/2007/07/12/ 
disaster-recovery-what-to-do-when-the-sa-account- 
password-is-lost-in-sql-server-2005.aspx). 

4. Use the PortQryUI utility to troubleshoot con- 
nectivity problems. To troubleshoot TCP/IP connec- 
tivity problems, you can use Microsoft’s PortQryUI 
utility. PortQryUI is the same as the venerable 
PortQry, except PortQryUI includes a GUI and some 
predefined services. A predefined service is a group 
of ports to scan. One of the predefined groups of 
ports is for SQL Server, which consists of UDP port 
1434 and TCP port 1433. So, to check these ports, 
you simply enter the IP address or Fully Qualified 
Domain Name (FQDN) of the target SQL Server 
instance. Then, in the Service to Query drop-down 
box, select SQL Service and click the Query button. 
If the ports are OK, the utility will tell you that 
the ports are Listening. Otherwise, it will tell you 
the ports are being Filtered or are Not Listening. 
You can download the PortQryUI utility from the 
“PortQryUI - User Interface for the PortQry Com- 
mand Line Port Scanner” web page (www.microsoft 
.com/downloads/details.aspx? FamilyID=8355e537- 
1ea6-4569-aabb-f248f4bd9 1 d0&displaylang=en). 

5. Use a different strategy when running DBCC 
CHECKDB against large databases. As databases 
become larger, maintenance procedures such as 
checking database integrity with T-SQL’s DBCC 
CHECKDB command take a lot longer. What 
can you do when DBCC CHECKDB exceeds 
the allocated maintenance window period? Paul 
S. Randal offers several potential solutions in his 
blog “CHECKDB From Every Angle: Consis- 
tency Checking Options for a VLDB” (sqlskills 
.com/BLOGS/PAUL/post/CHECK DB-From-Every- 
Angle-Consistency-Checking-Options-for-a-VLDB 
.aspx). One solution is to use a backup to restore 
the database on a separate server and run DBCC 
CHECKDB against that server. Another solu- 
tion is to set the database’s page verify option to 
CHECKSUM and run DBCC CHECKDB with the 
PHYSICAL_ONLY option. With this combination, 
DBCC CHECKDB runs in significantly less time, yet 
it still catches I/O subsystem and page corruption. 

6. Use the Import Package Option to deploy SQL 
Server Integration Services (SSIS) packages to the 
msdb database. SSIS has several deployment options. 
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ORE on the WEB 


Download the code at 
InstantDoc ID 102103. 


A simple method to deploy SSIS packages to SQL 
Server’s msdb database is as follows. In SQL Server 
Management Studio (SSMS), connect to Integra- 
tion Services. Under Stored Packages, right-click 
the MSDB folder and select Import Package. In the 
dialog box, specify the SSIS package’s current location 
and name. Set the Protection Level to Rely on server 
storage and roles for access control, then click OK. 
Manually copy any XML config file related to the 
SSIS package to the target server. If the package uses 
SQL Server configurations, the package connection 
string to the SQL Server configuration might need to 
be updated with the correct server name before you 
import it. For more information, see the “How to: 
Import a Package Using Integration Services Service” 
web page in SQL Server BOL (msdn.microsoft.com/ 
en-us/library/ms141235(SQL.90).aspx). 

7. Use SQL Server 2008's local server groups and 
Central Management Servers to query multiple servers 
at the same time. In SQL Server 2008, you can use a 
local server group to quickly connect to servers that 
you frequently manage. You can also use a Central 


Management Server to store servers’ registration 
metadata so that team members can use the same reg- 
istrations. Both can contain SQL Server 2008, SQL 
Server 2005, and SQL Server 2000 servers. In SSMS, 
you can query the servers in either a local server 
group or Central Management Server and combine 
the servers’ results. To do so, go SSMS’s Registered 
Servers window, right-click the local server group or 
Central Management Server, and select New Query. 
In the Query Editor, type the T-SQL command you 
want to run and click Execute. You'll get one result 
set with two columns. The first column contains 
each server’s name, and the command output from 
that server appears in the second column. For more 
information, see the “How to: Execute Statements 
Against Multiple Servers Simultaneously (SQL Server 
Management Studio)” web page in SQL Server BOL 
(technet.microsoft.com/en-us/library/bb964743.aspx). 


—Louis Nguyen, DBA, Centex 
InstantDoc ID 102136 


Improving the Index_Evaluation_USP 


Stored Procedure 


Í the Reader to Reader article “Evaluate 
n Index Usage in Databases” (October 2008, 
InstantDoc ID 99985), Shaunt Khaldtiance provided 
a stored procedure, Index_Evaluation_USP, that you 
can use to identify tables without indexes, indexes 
that aren’t being used, and indexes that aren’t being 
used efficiently. I made several improvements to that 
stored procedure. 

The original version of Index_Evaluation USP 
includes the code 


IF EXISTS (SELECT 1 FROM 
tempdb.sys.objects WHERE NAME LIKE 
'%indexmap%' ) BEGIN DROP TABLE 
#indexmap END 


This code works well if the 
#indexmap temporary table belongs 


IS NOT NULL BEGIN DROP TABLE 
#indexmap END 


The second change I made was to the inner 
SELECT statement in Listing 1. This SELECT 
statement reads from the dynamic management view 
(DMV) named sys.dm_db_index_usage_stats. This 
DMV contains one row for every index that was 
accessed since the SQL Server instance was restarted 
or a database was opened, so you can use it to obtain 
index usage information. However, sys.dm_db_index_ 
usage_stats includes data from all the databases in an 
instance. I needed the ability to analyze one database 
at a time, so I added AND i.database_ID=db_id() 
to the ON clause of the INNER JOIN statement, as 
callout A in Listing 1 shows. 


LISTING l: Modified Inner SELECT 


to the same session. However, if Statement 
there’s an #indexmap temporary  sELECT 
table created by another session, the condition i.object_id, 


will evaluate to TRUE but the DROP TABLE 
statement will fail because the current session 
has no such temporary table. To work under 
any circumstance, I replaced that code with 
the following code 


object_name(i.object_id) AS Table_Name, i.index_id, 
SUMCi.user_seeks) AS seeks, 

SUM(i.user_scans) AS scans, 

SUMCi.user_lookups) AS lookups 

FROM sys.tables t 

INNER JOIN sys.dm_db_index_usage_stats i 


(A) ON t.object_id = i.object_id AND i.database_ID=db_id(Q) 


GROUP BY i.object_id, i.index_id 


IF object_id('tempdb. .#indexmap') 
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Although sys.dm_db_index_usage_stats pro- 
vides dynamic information about index usage, it 
doesn’t contain all the information needed. Thus, 
the Index_Evaluation_USP stored procedure uses 


LISTING 2: New Code that Looks for Tables Without 
Indexes 


DECLARE @table_with_no_indexes int 
SELECT @table_with_no_indexes=count(*) FROM sys.tables t 


the sys.indexes static view to get information that 
isn’t exposed by sys.dm_db_index_usage_stats. 

In sys.indexes, a table will have a row for 
each index and a row representing the table itself 
(aka heap). In the index_id column, a value of 1 
indicates a clustered index, a value greater than 1 
indicates a nonclustered index, and a value of 0 
indicates the table itself. For example, a table might 
have a row showing index_id=0 (for the table itself) 
and two more rows showing index_id>1 (for the 
nonclustered indexes). Even when a table doesn’t 
have any indexes, sys.indexes will still contain the 
row representing the table itself, so you can tell if 
a table doesn’t have any indexes if the only row for 
that table is the one in which index_id=0. I added 
code to Index_Evaluation_USP that looks for this 
condition (see Listing 2). As in the original stored 
procedure, the tables with no indexes are further 
analyzed to see how many times they were accessed 
with a table scan. If the tables were scanned, 
they're good candidates to get indexes. 

The last improvement I made to the stored 
procedure was to add an input parameter— 
@i_DbName—in case I wanted to run it against 
a particular database instead of running it against 
all user databases on the SQL Server instance. 

You can download the modified stored proce- 
dure—dbo.Index_Evaluation_USP—by going to 
the SOL Server Magazine website (www.sqlmag 
.com), entering 102103 in the InstantDoc ID text 
box, and clicking the 102103.zip hotlink. You 
can place and deploy the stored procedure in any 
database. To analyze a single database, you can 
use code such as 


DECLARE @rc int 
EXECUTE @rc=dbo.Index_Evaluation_USP 
'DB' 


where DB is a valid database name. To analyze all 
the databases in the current SQL Server instance, 
you don’t include a parameter, using code such as 


DECLARE @rc int 
EXECUTE @rc=dbo.Index_Evaluation_USP 


The dbo.Index_Evaluation_USP stored procedure 
works on SQL Server 2008 and SQL Server 2005. 
SOU 

—Gabriela Nanau, DBA, 

Cornerstone Group of Companies 

InstantDoc ID 102103 
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WHERE NOT EXISTS (SELECT 1 FROM sys.indexes i 
WHERE i.object_id=t.object_id AND index_id>=1) 
PRINT ''Database ['+@database_name+'] ------------------------------------ 
Total number of tables without any index: 
+CONVERT (varchar(19) ,@table_with_no_indexes) 
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YourSQLDba 


Maintain your databases using this T-SQL 


stored procedure 


if your organization has a policy against 
database maintenance plans or would like to 
move away from them before they’re deprecated, 
take a look at YourSQLDba. It’s a T-SQL alterna- 
tive to database maintenance plans. Because Your- 
SQLDba is based on T-SQL, it’s much easier to 
implement than SQL Server Integration Services 
(SSIS) packages used for database maintenance. It 
also provides better logging than database mainte- 
nance packages and includes the ability to dynami- 
cally select databases for maintenance. 
YourSQLDba was written by Maurice Pel- 
chat, of the Societe GRICS in Canada, to make 
the deployment, maintenance, and troubleshoot- 
ing of routine database maintenance activities on 
SQL Server machines easier. Let’s look at the tasks 
YourSQLDba can be configured to do. 


What YourSQLDba Can Do 
YourSQLDba, which is a multi-faceted stored pro- 
cedure called YourSQLDba_DoMaint, includes 
features that are useful for automating back- 
ups, consistency checks, and other preventative 
maintenance tasks. Its default configuration uses 
two SQL Server Agent tasks, YourSQLDba_ 
LogBackups and YourSQLDba_FullBackups_ 
and_Maintenance, to back up all transaction logs 
every 15 minutes and compress oversized log files. 
The default configuration also cleans up server 
and agent logs, checks database integrity, updates 
distribution statistics, rebuilds and reorganizes in- 
dexes, and performs a full backup of each database. 
It concludes its default operations by backing up 
MSDB so that you can easily keep track of all data- 
base backups and SQL Server Agent job histories. 
In addition, YourSQLDba can be configured to 
* automate full database backups and recurring 
transaction log backups 
* maintain detailed logs containing information 
about its actions and any errors encountered 
e update statistics at regular intervals (defaults to 
once per week) 
* reorganize or rebuild indexes, choosing the best 
defragmentation option at the time of execution 
e check database integrity 
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e keep an up-to-date backup of MSDB, which 
records all backup history at the end of each full 
or log backup cycle 

* clean up the various history logs in SQL Server, 
including the backup log, SQL Server Agent, 
and SQL Server maintenance job histories 

e recycle the SQL Server error log each day 

* connect network drives to the database engine 
for backup purposes and make them automati- 
cally reconnect at start-up 

* provide simplified system stored procedures to 
perform manual backups, duplicate databases, 
and database restores 

e schedule agent tasks for maintenance 

* configure Database Mail for maintenance 
reporting 

* send email notifications regarding maintenance 
activity 

e send email notifications when especially prob- 
lematic maintenance issues arise 

e disable databases with integrity problems 
immediately 

e disable databases that have failed more than 
three consecutive backups. (The counter is reset 
after a manual backup.) 


Because YourSQLDba is a stored procedure, 
you must compile it on every SQL Server instance 
in which you want to deploy it. When deployed, it 
creates a small database called YourSQLDba, as 
well as several small objects. Also, YoursQLDba 
is smart in upgrade situations, replacing all objects 
that have changed since the last time you deployed 
the script. It creates a single startup stored proce- 
dure called CreateNetworkDrive in the Master da- 
tabase. It also creates a database mail profile and 
SQL Server Agent operator. 


System Requirements 
YourSQLDba uses simple T-SQL to perform 
database maintenance tasks and makes calls to 
xp_cmdshell and sqlcmd.exe. YourSQLDba runs 
on any OS that supports SQL Server 2008 or 2005. 
[SQL 
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riven in no small part by a flagging econo- 

my, virtualization has taken the IT industry 

by storm. Virtualization has enabled busi- 
nesses to sidestep traditional OS and hardware limita- 
tions by allowing multiple servers to run together on a 
single hardware platform. For organizations running 
Windows Server 2008, Hyper-V is an especially com- 
pelling option as Hyper-V is included as an installable 
role. Let’s look at some of the driving forces that are 
motivating businesses to embrace virtualization— 
even for holdout installations such as SQL Server— 
and what you need to know should you decide to run 
SQL Server on Hyper-V. 


Driving Forces Behind 
Virtualization 

One of the most common reasons why organizations 
have taken advantage of virtualization is to consoli- 
date servers. (Although you can use SQL Server’s 
multiple instances feature to consolidate SQL Server 
systems, it isn’t widely used for that purpose. For more 
information about this feature, see the “Virtualization 
vs. Multiple Instances” sidebar.) Organizations can 
use server consolidation to combine the workloads 
of multiple physical servers onto a single powerful 
server, where each workload runs in a separate virtual 
machine. This increases server utilization and ROI, 
which is especially important because today’s pow- 
erful multicore servers are typically underutilized. 
Server consolidation enables organizations to take 


advantage of this unused computing potential. It also 
improves manageability by reducing the number of 
server systems that need to be managed. 

Another driving force behind virtualization is en- 
hancing business continuity. Because virtualization 
abstracts the server OS from the underlying hardware, 
it enables more flexibility for backup and disaster re- 
covery. For instance, a server failure in a tradition 
SQL Server hardware-based installation typically 
requires a secondary backup system. Some organiza- 
tions use failover clustering for server-level hardware 
protection, but failover clustering is typically limited 
to the most important servers. A hardware failure for 
most servers means bringing in a replacement server, 
reimaging the system, and restoring the latest back- 
up—a process that can take several hours at the very 
minimum. Virtualization technology can speed up 
the recovery process by allowing an organization to 
bring a copy of the SQL Server virtual machine (VM) 
online on another virtual server—a process that takes 
seconds instead of hours. In addition, features such 
as Quick Migration and Live Migration (which will 
be available in Windows Server 2008 R2) enable busi- 
nesses to address planned downtime scenarios with 
little or no interruption of services to the end user. 

Another popular use for virtualization is using it 
for development servers. You can also use virtualiza- 
tion for staging servers used to test system updates 
and application changes before deploying them into 
production. 
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What you 


For all its potential benefits, many organizations 
running SQL Server have been slow to adopt virtu- 
alization for their database servers. The trade-off for 
the benefits of virtualization is performance. Many 
businesses are concerned that running SQL Server on 
a VM won't provide the performance that their us- 
ers require. But this was more the case in the early 
days of virtualization. Today’s second generation 
hypervisor-based virtualization products like Micro- 
soft’s Hyper-V and VMware’s ESX Server have been 
successfully used in many production virtualization 
implementations. 

In addition, many people are under the mistaken 
impression that Microsoft doesn’t support running 
SQL Server in a virtual environment. This used to be 
true for SQL Server 2005 and the older Virtual Server 
2005 product. However, that all changed with the 
release of the Hyper-V virtualization platform and 
SQL Server 2008. Microsoft supports SQL Server 
running under Hyper-V, ESX Server, and other vir- 
tualization platforms that are certified through Mi- 
crosoft’s Server Virtualization Validation Program 
(SVVP). To learn more about Microsoft’s support 
policy for running SQL Server in a virtualized envi- 
ronment, see the Microsoft article “Support policy 
for Microsoft SQL Server products that are running 
in a hardware virtualization environment” (support 
-microsoft.com/kb/956893). 

Finally, the licensing for running SQL Server on a 
VM can be confusing. However, when using the right 
Windows Server and SQL Server editions, there can 
be significant licensing advantages when using virtu- 
alization. For more information about licensing, see 
the sidebar “Virtualization Licensing for Windows 
Server and SQL Server.” 


Optimizing VM I/O 

Performance is the number one concern for running 
SQL Server on a VM. The bottleneck of most data- 
base servers is I/O and not processing power. This is 
especially true for SQL Server running on a VM. Tak- 
ing steps to improve the I/O throughput is one of the 
most important factors when setting up SQL Server 
VMs. The first issue is choosing the right type of Virtu- 
al Hard Disk (VHD) for your SQL Server installation. 
Hyper-V supports the following types of VHDs: 

* Dynamic VHDs. A dynamic VHD uses only the 
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need to know 


VIRTUALIZATION VS. MULTIPLE 
INSTANCES 


here are two basic ways to consolidate SQL Server systems: consolidation 

through virtualization and consolidation by using SQL Server’s multiple 
instances feature. At first pass, using multiple instances and running instances of 
SQL Server on virtual machines (VMs) might seem to be very much the same, but 
there are some substantial differences. 

Multiple instances have been supported since SQL Server 2000. The SQL Server 
Express and Workgroup editions support a maximum of 16 instances, whereas the 
Web Standard and Enterprise editions support a maximum of 50 instances. Mul- 
tiple instances 
support essen- 
tially allows 
multiple copies 
or instances of 
SQL Server to 
be installed on 
asingle OS. 

You can see an 
overview of SQL 
Server’s multiple 


O 


SQL SQL SQL 
Server Server Server 


Microsoft Windows Server 2003 


Physical Server 


Figure A instances support 
Multiple i in Figure A. 
sig Een En Virtualization 


is quite different. With virtualization, each VM has its own virtual hardware and 
guest OS. SQL Server is then installed on the guest OS. Virtualization allows you 
to run different versions of the guest OS on different VMs as well as different 
versions of SQL Server or other applications. You can see an overview of running 
SQL Server on multiple VMs in Figure B. 

Multiple instances support typically provides better performance than virtualiza- 
tion because there’s less system overhead. However, virtualization provides better 
flexibility and availability by enabling VMs to be moved between different hosts. It's 
worth noting that multiple instances and virtualization are not mutually exclusive. 
You can install multiple instances of SQL Server on a VM. For more information 
about the differences between virtualization and multiple instances, see “VMs vs. 
Multiple SQL Server Instances” (December 2007, InstantDoc ID 97439) and “VMs 
vs. Multiple SQL Server Instances, Round 2” (March 2008, InstantDoc ID 95148). 
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pass-through VHD 
in Hyper-V 
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You can change how this virtual hard dsk is attached to the virtual machine. Ifan 
operating system is installed on this dsk, changing the attachment might prevent the 


starting. 
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You can compact or convert a virtual hard disk by editing the .vhd file. Specify the 
full path to the file. 


© Virtual hard disk (.vhd) fie: 


[oik 193.37 G8 Bus iLunOTargetO v 


P 1f the physical hard dsk you want to use is not isted, make sure that the 
disk is offine. Use Disk Management on the physical computer to manage 
physical hard disks. 


To remove the virtual hard disk, dick Remove. This disconnects the disk but does not 


delete the .vhd file. 
Remove | 


space required and automatically expands when 
more space is needed. 

e Fixed VHDs. A fixed VHD is sized to its full 
capacity upon creation. 

Pass-through VHDs. A pass-through VHD is con- 
figured to use storage on the virtualization host. 


DD Processor 


You can modify the number of virtual processors based on the number of processors on 


the physical machine. You can also modify other resource control settings. 


You can use resource controls to balance resources among virtual machines, 
Virtual machine reserve (percentage): E 
Percent of total system resources: [so 
Virtual machine limit (percentage): [100 
Percent of total system resources: [ 
ED 


Relative weight: 


processor functionality to run an older operating system such az Windows NT 
on the wrtal machine. 


I” Limit processor functionality 


For test and development servers, dynamic VHDs 
are usually best because they reduce the disk space 
that’s required. For SQL Server systems running 
production workloads, fixed or pass-through VHDs 


are best. They provide better performance because 
they won’t experience the performance impact 
that occurs when a dynamic VHD is expanded. 
Microsoft’s SQL Server Customer Advisory Team 
(SQLCAT) did some in-depth testing of SQL 
Server 2008 running on Hyper-V. SQLCAT pro- 
vided several performance recommendations, one 
of which was using pass-through disks for best 
performance under Hyper-V. To read the com- 
plete SQLCAT test results, go to sqlcat.com/white 
papers/archive/2008/10/03/running-sql-server- 
2008-in-a-hyper-v-environment-best-practices- 
and-performance-recommendations.aspx. 

Hyper-V’s New Virtual Machine Wizard and New 
Virtual Hard Disk Wizard don’t let you create pass- 
through disks. To create a pass-through VHD, open 
the Hyper-V Manager. Right-click the VM that you 
want to modify to use pass-through VHDs, then se- 
lect the virtual hard drive to modify. Select the Physi- 
cal hard disk radio button, then choose the physical 
disk that you want to use as the pass-through disk. 
The physical disk can be a local drive, or it can be on 
a SAN. The drive must be in an offline state to be in 
the drop-down list. You can see an example of how to 
configure a Hyper-V VM to use pass-through VHDs 
in Figure 1. 

The actual I/O performance that you get is highly 
dependent on the underlying storage subsystem that 
youTe using, but all things being equal, pass-through 
VHDs provide the best performance. However, they 
don’t offer the flexibility of fixed VHDs and aren’t 
as portable between systems. Fixed VHDs also offer 
very good performance and are a better solution for 
most SQL Server workloads. 

The type of storage you use for your VHDs will 
make a big difference. Using a multipath-capable 
SAN for your VHD storage will provide the best 
performance. SANs are highly scalable and let you 
spread the I/O over multiple drive spindles. Complete 
information for configuring SQL Server to use a SAN 
is beyond the scope of this article. For more informa- 
tion about properly configuring SQL Server to use a 
SAN for I/O, see “SQL Server on a SAN” (January 
2006, InstantDoc ID 48486) and “What’s the Best 
Way to Carve Up a SAN?” (September 2007, Instant- 
Doc ID 96555). 


Assigning Virtual 

Processors 

Next to I/O, the virtual processor is probably the next 
most important VM performance factor. Assigning 
the number of virtual processors for a SQL Server 
VM is a relatively straightforward task for systems 
with four CPU cores or less. Hyper-V supports up to 
four virtual CPUs per VM. To assign multiple virtual 
processors to your VM, open the Hyper-V Manager. 
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Right-click the VM, select Settings, then click Proces- 
sor. In the Number of logical processors drop-down 
list, select the number of virtual processors to be used 
by the VM, as Figure 2 shows. Note that you can as- 
sign only as many virtual processors to a given VM as 
there are physical cores in the systems. For instance, 
if you have a quad-core system, you can’t assign more 
than four virtual CPUs per VM. 

In a server consolidation environment, it’s es- 
pecially important to ensure that the virtualization 
host has adequate processing power to handle all 
the active VMs. To calculate the required processing 
power needed by the Hyper-V host, you can use the 
formula 


CPU Cores x CPU Speed x CPU Utilization = 
Total CPU 


where CPU Cores is the number of CPUs in the phys- 
ical server to be virtualized, CPU Speed is the CPU’s 
speed, and CPU Utilization is the average utilization 
for that CPU. You need to perform this calculation 
for all the servers that will be running on the Hyper-V 
host. Then, you need to add the Total CPU values to- 
gether to get an overall estimate of the CPU require- 
ments for all the VMs, using the formula 


Sum(Total CPU) = Overall CPU 


Although specific requirements will vary, a good rule 
of thumb is to reserve 25 percent of the host’s pro- 
cessing power. So, you can determine the host’s virtu- 
alization capability using the formula 


Overall CPU x 25% 


Ideally, you should strive for a 1-to-1 mapping be- 
tween virtual CPUs and physical cores, although 
most server consolidation environments have a higher 
ratio. One thing to bear in mind is that adding virtual 
CPUs to your VMs won't result in linear performance 
improvements. In other words, adding a second vir- 
tual CPU won't double the performance of your VM. 
You're far more likely to see more modest gains. Some 
experts cite improvements of 10 to 15 percent by add- 
ing virtual CPUs. 


Calculating VM Memory 

To size the memory for a SQL Server VM, simply 
create the VM with the same amount of memory 
allocated to that VM on the Hyper-V host. If the 
host is memory-constrained, this is a good time to 
add more memory. Hyper-V supports up to 64GB 
per VM, and the host can support up to 1TB of 
physical memory. To take advantage of memory be- 
yond 4GB, the VM’s guest OS must be 64-bit. It’s 
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VIRTUALIZATION LICENSING FOR 
WINDOWS SERVER AND SQL SERVER 


L icensing Microsoft products always seems to be an overly complicated affair 
and virtualization doesn’t make the task any easier. As a basic rule of thumb, 
you should consider a virtual machine (VM) to be the same as a physical server. 
While there are exceptions, you should plan on buying licenses for the OS that 
runs on the VM, server software such as SQL Server, and any CALs. 

However, there are exceptions. The Windows Server 2008 editions allow the 
following number of virtual instances with no additional licenses: 
e The Standard Edition allows one virtual instance. 
e The Enterprise Edition allows four virtual instances. 
e The Datacenter Edition allows an unlimited number of virtual instances. 


This holds true no matter what virtualization technology you use. 
SQL Server offers three licensing options: 

e Per server and client device 

e Per server and user 

e Per processor 


Per server licensing is rather straightforward for virtualization. Each VM running 
SQL Server requires a server license. The licenses for client devices or users for a 
VM installation are the same as those for a physical server. 

When using SQL Server’s per-processor licensing for a VM, you license the 
virtual processors in the VM—not the physical CPU in the virtualization host. 
However, there’s an exception when using SQL Server 2008 Enterprise Edition. 
If you opt for per-processor licensing for this edition and you license all of the 
CPUs in the host, you can have an unlimited number of SQL Server instances 
running on VMs with no additional licensing costs. You can find more information 
about SQL Server licensing at download.microsoft.com/download/1/e/6/ 
1e68f92c-f334-4517-b610-e4dee946ef9 1/2008%20SQL%20Licensing%20 
Overview%2Ofinal.docx. 


InstantDoc ID 102076 


important to note that the combined RAM require- 
ments of the active VMs can’t be more than the 
amount of physical RAM in the Hyper-V host. For 
instance, a 32GB Hyper-V host couldn’t support five 
8GB VMs. 

To calculate the Hyper-V host memory require- 
ments, you first need to add about 32MB of memory 
for virtualization overhead to each VM’s RAM, then 
aggregate all the VMs’ RAM. In addition, you need 
to reserve memory for the host. As a general rule of 
thumb, reserving 512MB of RAM for the Hyper-V 
host is adequate. The hypervisor itself requires about 
300MB. So, you can use the following formula to cal- 
culate host memory requirements: 


Sum(VM RAM + 32MB) + 512MB + 3ØØMB 


You can find a handy Excel spreadsheet for calculat- 
ing Hyper-V host memory requirements at cid-2095 
eac3772c41db.skydrive.live.com/self.aspx/Public/ 
Hyper-V%20RAM%20Calculator.xls. A useful tip to 
remember regarding memory is that adding memory 
to the VM can help boost its I/O performance by en- 
abling SQL Server to use the additional memory for 
caching. 
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Using Hyper-V Synthetic 
Network Devices 

Another configuration setting that can have a big im- 
pact on server performance under Hyper-V is mak- 
ing sure the SQL Server VM uses Hyper-V’s new 
synthetic network devices. Synthetic network devices 
take full advantage of the Hyper-V high-performance 
VM bus architecture. For SQL Server virtualization, 
it’s important that the VM uses the new synthetic net- 
work adapter. The legacy network drivers are provid- 
ed in Hyper-V for compatibility with older OSs, but 
they're processed by threads in the parent partition 
and don’t provide the same level of performance as 
the newer synthetic network devices. 

Drivers for the new synthetic network devices are 
part of Windows Server 2008 and Windows Server 
2008 R2. If you use them as guest OSs in your VMs, 
the synthetic device drivers will be present. However, 
these synthetic device drivers aren’t part of Windows 
Server 2003 or earlier server OSs. If you're running 
Windows Server 2003 as your guest OS, you need to 
install the Hyper-V Integration Components. The In- 
tegration Components include the synthetic network 
device drivers as well as enhanced mouse and video 
support and host time-synchronization. You can en- 
sure that you're using the synthetic devices by con- 
necting to the VM, then opening Device Manager. 
Expand the Network adapters node and check for the 
value Microsoft VMBus Network Adapter, like you 
see in Figure 3. 

To install the Integration Components, open the 
Hyper-V Manager, right-click the VM, and select 
Connect. In the Virtual Machine Console that ap- 
pears, select Action, Insert Integration Services Setup 
Disk. Then just click Next throughout the wizard 
screens. 


Capacity 

One of the easiest performance factors to 
overlook when performing server consoli- 
dation is your available network adapter 
capacity. When you consolidate multiple 
physical SQL Server systems on a single 
server, you're essentially taking all the net- 
work traffic that was previously directed 
to multiple servers and their NICs and 
funneling them through fewer NICs on 
the Hyper-V hosts. To calculate the net- 
work capacity required, you can use the 
formula 


VM NICs x NIC Speed x NIC 
Utilization = 
NIC Requirements 


where VM NICs are the number of virtual machines 
that will be active on the host, NIC Speed is the speed 
of the NICs, and NIC Utilization is the average net- 
work utilization of those NICs. 

To calculate the host’s total network capacity, 
take the number of NICs in the host and multi- 
ply that number by their speed, as is shown in the 
formula 


Number of NICs x NIC speed = 
Total NIC Capacity 


If you divide the Total NIC Capacity by the NIC Re- 
quirements, you'll have an idea of how many Hyper-V 
host network adapters you'll need. Ideally, you should 
allocate a separate host NIC for each VM. Another 
recommended practice is to reserve an additional 
network adapter for Hyper-V host management. For 
added security, it’s a good idea to put the manage- 
ment NIC on a separate network and not with the 
NICs used by the production VMs. 

If you’re familiar with ESX Server, you might be 
wondering about NIC teaming. NIC teaming lets you 
bind multiple host network adapters together. Al- 
though Hyper-V doesn’t support NIC teaming, you 
can implement it if you buy the right NICs. 


Virtually SQL 
Virtualization is a mature technology that’s ready for 
prime time. While the Hyper-V platform can provide 
production-level performance for SQL Server, you 
still need to be sure to correctly size your VMs as 
well as take advantage of Hyper-V features such as 
multiple virtual processors, 64-bit guest support, and 
synthetic network devices. [SQL 
InstantDoc ID 102074 
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CLR-Based 
Solution 


for Running Aggregates 


Good performance for large partitions 


| the past couple of months I started a series of 
n articles covering solutions to running aggre- 
gates. This series focuses on analyzing performance 
aspects of the various solutions—in particular their 
algorithmic complexity. In other words, this series 
explores the way the solutions scale when different 
variables such as the number of aggregates, parti- 
tions, or rows per partition change. In “Subqueries 
and Joins for Running Aggregates” (May 2009, 
InstantDoc ID 101623) I presented the problem and 
discussed set-based solutions using subqueries and 
joins, and in “Set-Based vs. Cursor-Based Solutions 
for Running Aggregates” (June 2009, InstantDoc 
ID 101736) I covered a solution based on T-SQL 
cursors. This month I present a Common Language 
Runtime (CLR)-based solution and explain the cir- 
cumstances in which it would perform better than 
the other solutions. 


Getting Started 
For your convenience, Web Listing 1 (www.sqlmag 
.com, InstantDoc ID 102097), Web Listing 2, and 
Web Listing 3 contain the code necessary to create 
and populate the Sales table that is used in the 
article’s examples. Run the code in Web Listing 1 
to create the table. Run the code in Web Listing 2 
to create the GetNums function that is used to 
populate the Sales table with sample data. Use the 
code in Web Listing 3 to populate the Sales 
table with sample data, adjusting the number of 
partitions (employees) and number of rows per 
partition (days per employee) based on your needs. 

As a quick reminder, the Sales table contains 
a row for each employee and date, with the 
employee ID, sales date, quantity, and value. The 
running aggregate that is used as the basis for the 
performance analysis is a running sum of quantity 
(or value) per employee and date; that is, for each 
employee and date, calculate the total quantity 
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from the beginning of the employee’s activity until the 
current date. 


CLR-Based Solution 
SQL Server 2005 introduced CLR integration, which 
lets you develop routines such as functions and pro- 
cedures using .NET code. CLR integration comple- 
ments T-SQL in areas where it is weak, such as string 
manipulation and iterative logic. T-SQL is typically 
preferable for tasks that involve data manipulation, 
both in terms of performance and code complexity. 
However, there are a few uncommon cases that involve 
data manipulation tasks in which CLR-based solutions 
sometimes outperform T-SQL solutions. Running 
aggregates is such an example. The reasons that in 
certain cases CLR-based solutions can out- 
perform T-SQL solutions have to do with the 
way the optimizer currently treats set-based 
solutions, and because of lack of support 
for ANSI SQL set—based language elements 
that lend themselves to better optimization. I’ll present 
the CLR-based solution and compare its performance 
to the solutions that I provided in previous columns. 
In a future article I'll describe the missing language 
elements and explain why they have great potential to 
perform better than any other solution. 

Listing 1, page 18, shows a CLR-based solution 
for our problem using C# code. The solution defines 
a stored procedure called SalesRunningSum that 
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LISTING |: CLR-Based Solution Using CH 


CLR-BASED SOLUTION 


using System; 

using System.Data; 

using System.Data.SqlClient; 
using System.Data.SqlTypes; 

using Microsoft.SqlServer.Server; 


public partial class StoredProcedures 


a 


D 


[Micro 
public 
{ 


usin 


soft.SqlServer.Server.Sq]lProcedure] 
static void SalesRunningSum() 


g (SqlConnection conn = new SqlConnection("context 


connection=true;")) 


Sq 
COI 
COI 


Sq 
co 
co 
co 
co 
Sq 
Sq 
co 
Sq 


Sq 
Sq 


wh 


} 
Sq 


1Command comm = new SqlCommand() ; 
mm. Connection conn; 
mm.CommandText = @"" + 

"SELECT empid, dt, qty “ + 

"FROM dbo.Sales " + 

"ORDER BY empid, dt;"; 


MetaData[] columns = new SqIMetaData[4]; 


umns[@] = new SqlMetaData("empid", SqlDbType. Int) ; 
umns[1] = new SqlMetaData("dt", SqIlDbType.DateTime) ; 
umns[2] = new Sq|lMetaData("qty", SqlDbType.Int) ; 
Jumns[3] = new SqiMetaData("sumqty", SqlDbType.BigInt) ; 


DataRecord record = new SqIDataRecord(columns) ; 
Context.Pipe.SendResultsStart(record) ; 
nn.Open() ; 


DataReader reader = comm.ExecuteReader(); 


Int32 prvempid = ð; 
lInt64 sumqty = ð; 


ile (reader.Read(Q)) 


SqlInt32 empid = reader.GetSqlInt32(9) ; 
SqlInt32 qty = reader.GetSqlInt32(2); 


if (empid == prvempid) 
{ 
sumqty += qty; 


else 
{ 

sumqty = qty; 
prvempid = empid; 


record.SetSqlInt32(@, reader.GetSqlInt32(9)); 
record.SetSqlDateTime(1, reader.GetSq|lDateTime(1)) ; 
record.SetSqliInt32(2, qty); 

record. SetSqlInt64(3, sumqty) ; 


Sq1Context.Pipe.SendResultsRow(record) ; 


1Context.Pipe.SendResultsEnd() ; 


calculates a running sum of quantity for each employee 
and date. Web Listing 4 has the Visual Basic ver- 
sion of the solution in case that’s your language of 
preference. 

The procedure’s code defines a SqlConnection 
object called conn that uses the current user’s connec- 
tion to SQL Server by specifying the option context 
connection=true. The code then defines a SqlCom- 
mand object called comm that uses conn as its con- 
nection. The code sets the CommandText property of 
comm to a query against the Sales table that retrieves 
the sales rows (employee ID, date, and quantity) 
ordered by employee ID and date. The code then 
defines an array called columns containing four Sql- 
MetaData objects with the definitions of the four result 


columns that the procedure will return in its result set: 
empid, dt, qty, and sumqty. The code continues by 
defining a SqlDataRecord object called record based 
on columns. The code then marks the beginning of the 
result set that will be streamed to the client by invoking 
the SqlContext.Pipe.SendResultsStart method based 
on the record parameter, and opens the connection. 

From this point on the code uses a data reader 
(SqlDataReader object called reader) to iterate through 
the records returned by the query defined earlier in 
comm’s CommandText property. In each iteration 
the code accumulates the quantity in a variable called 
sumqty as long as the employee ID didn’t change; 
otherwise it overwrites the value of sumqty with the 
current row’s quantity. The code sets the four attributes 
of the result record with the current employee ID, date, 
quantity, and running sum of quantity, and sends the 
result row to the client by invoking the SqlContext. Pipe 
.SendResultsRow method based on the record param- 
eter. Finally, the code marks the end of the result set 
and returns the pipe to its initial state by invoking the 
SqlContext.Pipe.SendResultsEnd method. 

You might be curious why I used a stored procedure 
to implement the solution rather than a table-valued 
function (TVF); after all, because the output is a 
result set, it would be more natural to use a TVF and 
consume the result from an outer query. The reason 
for this is that a stored procedure is more efficient with 
large result sets or under load because it relies on a 
“streaming” implementation that uses memory only 
for one row at a time. As for a TVF solution, the Con- 
text Connection can be open only in the actual TVF 
method, not in the Fill Row method. In the Fill Row 
method you must fill an intermediate result collection, 
which requires allocation of a lot of memory. 

Deploy the SalesRunningSum stored procedure 
in the tempdb database. For details about deploying 
user-defined assemblies in SQL Server, see “5 Steps for 
Developing and Deploying CLR Code in SQL Server” 
(April 2006, InstantDoc ID 49479). 

Assuming you already created and populated the 
Sales table, use the following code while connected to 
tempdb to test the procedure: 


EXEC dbo.SalesRunningSum; 


Performance 

If you think about it, the CLR-based solution is just 
another version of a cursor—only a much faster one 
compared with the T-SQL cursor. The reason the CLR 
data reader is faster than the T-SQL cursor is that the 
overhead of each record manipulation of the former 
is lower than that of the latter. Recall that I expressed 
the cost of previous solutions based on elements such 
as number of partitions (represented by the letter p), 
average number of rows per partition (r), and number 
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of aggregations to calculate (a). I used the letter o to 
express the overhead that is associated with the manip- 
ulation of each record using a T-SQL cursor. You can 
then express the overhead of each record manipulation 
with a CLR data reader as c, where c is smaller than 
o. Soon you will be able to tell how much smaller c is 
than o based on performance tests. 

The cost of processing p partitions with average 
partition size of r rows using a T-SQL cursor can be 
expressed as pro. Similarly, you can express the cost of 
the CLR-based solution as prc. 

Last month I mentioned that the performance of 
the T-SQL cursor-based solution isn’t significantly 
affected when you increase the number of aggregations 
(a). As expected, my performance tests show the same 
applies to the CLR-based solution. To give you a sense, 
I got only about 15 percent performance degradation 
when calculating four aggregates instead of one. 

As for increasing the number of partitions (p), as 
expected, the performance degradation was linear— 
which makes sense, because increasing the number of 
partitions by a factor of f should give you the cost pfrc. 
Figure | shows the results of three performance tests 
comparing the set-based solution using subqueries, 
the T-SQL cursor-based solution, and our new CLR- 
based solution. I started with 10,000 partitions with 
a constant partition size of 10, and kept increasing 
the number of partitions by 10,000 every time until I 
reached 100,000 partitions. 

The graph clearly shows that all three solutions have 
linear complexity with respect to an increase in the number 
of partitions. You can make two other interesting obser- 
vations by inspecting the graph. First, the CLR-based 
solution is more than four times faster than the T-SQL 
cursor-based solution. Second, with our very small 
partition size of 10 rows, the set-based solution using 
subqueries is faster than the CLR-based solution. 

The cost of the set-based solution is expressed as 
pr + 1°)/2. With respect to an increase in the parti- 
tion size (r) by a factor of f its complexity is close to 
quadratic—p(rf + (rf})/2. With the cursor solution, 
the complexity is linear, because pro becomes prfo. The 
performance tests that I covered last month showed 
that up to a partition size of about 500 rows the set- 
based solution was more efficient than the cursor-based 
solution, whereas the reverse was true with a partition 
size of more than 500 rows. Figure 2 shows the effect of 
changing the partition size on all three solutions. I used a 
constant number of partitions (1,000) and partition sizes 
varying from 10 to 1,000. Interestingly, the point where 
the CLR-based solution becomes more efficient than the 
set-based one is about 15 rows per partition. 


When to Use It 
The CLR-based solution to running aggregates 
has close to constant complexity with respect to an 
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Graph showing effect of change in partition size 


increase in the number of aggregates to be calculated, 
and linear complexity with respect to an increase 
in number of partitions or in partition size. In my 
performance tests, the CLR-based solution was more 
than four times faster than the T-SQL cursor-based 
solution. Up to a partition size of 15 rows, the set- 
based solution was faster than the CLR solution, and 
the reverse was true beyond 15 rows. If performance 
is your main goal, you should use the set-based solu- 
tion only with very small partition sizes; otherwise, 
you should use the CLR-based solution. Next month 
I'll cover another solution to running aggregates that 
involves nested iterations, and I'll explain which lan- 
guage elements that address running aggregates exist 
in standard SQL but aren’t yet implemented in SQL 
Server 2008. [SQL] 
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for your business intelligence (BI)— 
processing needs with the all-in-one HP 
BI Sizer for Microsoft SOL Server 2005. 


With the HP BI Sizer you can: 


+ Calculate your configuration using atomic modeling 
algorithms— no mapping to a matrix of reference 
configurations 

+ Convert your user workload profile into the standard 
calculated Query Rate Per Hour (OPH) rate 

+ Modify I/O estimates—producing more accurate and 
real-world sizing estimates 

+ Custom configuration recommendations—generated for 
your unique environment 
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7 instructional screencasts—hosted by David Chernicoff 
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Lessons Learned 


from 


Data we re 


design 


These business and design best practices can 
prevent costly data warehouse mistakes 


your mistakes, but mistakes made in a data 

warehouse/business intelligence (BI) environment 
tend to be very expensive and possibly career-killers. 
Building a data warehouse isn’t a simple task; it takes 
a village to build one from which you can extract 
viable information. 

A data warehouse and the operations that build 
and maintain it are a combination of business con- 
siderations and design best practices. It can be a huge 
challenge to balance these requirements and still be 
able to deliver valuable content to end users. Make 
no mistake—not everyone is suited to work in a data 
warehouse environment, but the work is stimulating 
and the rewards are satisfying. Here are some of the 
lessons I’ve learned from designing data warehouses 
and BI environments. 


i t’s often said that the best way to learn is from 


Get Buy-in from Business and 
Technical Groups 
When designing a data warehouse, business alignment 
must come first because if the product that’s delivered 
at the end of the day doesn’t reflect the goals and 
purposes of your organization or meet users’ needs, 
then the data warehouse is a failure. To give your 
data warehouse any chance of success you need to get 
buy-in from the business and technical sides of 
your organization, from the top-level executives 
to the front-line worker bees. Don’t believe that 
“if you build it they will come.” An IT-driven, 
IT-sponsored, IT-centric data warehouse/BI 
project is leaving out the most important facet 
of a BI solution—alignment to business needs 
and requirements. Involving business decision 
makers throughout the project ensures that the 
data warehouse will deliver meaningful, useful 
information to business users. 

At the beginning of your data warehouse 
project, establish a steering committee that’s 
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composed of both business decision makers and 
technical decision makers, and consider establishing a 
BI competency group that reports to a non-technical 
C-level manager, such as the COO or CFO. Use these 
people as a resource throughout the project's lifetime 
and even after the system is in production. 


Maintain Data Integrity 

When designing your data warehouse environment, 
it’s important to minimize data replication in the 
table design (by using conformed dimensions, for 
example) and data inconsistency by vetting source 
data and adhering to Master Data Management 
(MDM) standards. (For more information about 
MDM, see “Master Data Management,” January 
2007, InstantDoc ID 94193, or “Master Data Man- 
agement Challenges,” www.sqlmag.com, InstantDoc 
ID 97881.) 

Poor data quality is one of the biggest hurdles when 
it comes to data warehouse adoption rates. You can 
do the following to mitigate poor data quality: 

e Identify which business units own specific data and 
are responsible for its quality. 

e Appoint a data steward in each business unit and 
include “data quality” in the job description. 

e Make sure your extraction, transformation, and 
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Determine what level of 
data quality is considered 
to be “good enough” by 
involving the business units 
and finding out what their 
data analysis needs are. 


DATA WAREHOUSE DESIGN 


loading (ETL) routines are scouring for data 
quality problems and either correcting the data 
before it’s loaded into the warehouse or blocking 
it entirely. 

e Audit and verify data as necessary by setting up 
back-end routines that periodically reconcile ware- 
house data with source data, taking into account 
any inaccuracies that are inherent to the source 
data store. 

e Determine what level of data quality is considered 
to be “good enough” by involving the business 
units and finding out what their data analysis 
needs are. A business unit that’s using only highly 
summarized data might not need data to be 
cleansed to the same level as a unit that’s drilling 
down into the detail level. 


Provide User-Friendly 
Interfaces 

Strive for user-friendly interfaces because a data 
warehouse without users is a failure, and a BI system 
that’s too complicated 
to learn and incorpo- 
rate into everyday use 
is a disaster. Providing 
highly summarized visual 
information to the busi- 
ness decision makers is 
one of the things that 
every data warehouse/BI 
system must be able to 
do. Dashboards, which 
are highly visual real-time 
reflections of operational 
activities, and balanced scorecards, which are non— 
real-time reports that blend the operational, mar- 
keting, developmental, and financial aspects of 
an organization, are both highly summarized and 
symbolic of a data warehouse. However, without 
an overarching strategy, dashboards and balanced 
scorecards can easily become flashy facades for silo- 
specific “islands of influence” that aren’t connected to 


enterprise objectives. 


To avoid these islands of influence, get your 
strategy map in place. A strategy map links the long- 
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term goals and objectives 
of an enterprise with its 
operational activities, 
illustrating the cause- 
and-effect relationship 
between different Key 
Performance Indicators 
(KPIs) that you'll see on 
a balanced scorecard and 
possibly on dashboards. 
Then, make reports as 


visual as possible—a picture is truly worth 1,000 
words when it comes to your data warehouse. 


Reduce Operational Costs 

It’s important to make your data warehouse environ- 
ment easy and cost-effective to support and maintain. 
Design implementation costs can quickly get out 
of control. It’s not unusual to have an 80 percent/ 
20 percent division in cost between the back-end 
ETL and table building processes and the front-end 
applications and reports. Reduce design implementa- 
tion costs by creating a set of reusable designs and 
repeatable ETL processes so that you don’t have to 
start each data warehouse/BI project with a clean 
slate. Simplify the design by conforming dimensions 
and standardizing the granularity of the fact tables 
wherever possible; do not over-engineer the tables. 
Overly complex designs can quickly drive costs to the 
point where the entire project fails. 


Make Your Data Warehouse 
Scalable 

Your business is going to change over time, so you 
need to make sure your data warehouse/BI environ- 
ment can, too. According to the Gartner Group 
(Gartner RAS Core Research Note G00161231, 
James Richardson and Bill Hostmann, September 12, 
2008), during the first year of a data warehouse/BI 
implementation, you should expect users to request 
changes that will affect 35 percent to 50 percent of 
the application functions. Anticipating and managing 
change will be one of your biggest challenges. A good 
change-enhancement request system should be at the 
top of your wish list so that you can better manage 
the changes that are inevitable with a data warehouse/ 
BI implementation. 

In addition, hardware and software costs need 
to be balanced with scalability. The enterprise will 
change and so will your data warehouse/BI environ- 
ment. If the data warehouse isn’t a core competency 
or the primary line of business, then you should 
seriously consider running your data warehouse in 
the cloud. At the time of this writing, Microsoft 
doesn’t have a viable cloud computing solution for 
large systems, but there are other options available, 
including GoGrid, Amazon/Pentaho, and VMware’s 
newly-released vSphere 4 cloud OS. With the right 
choice of cloud vendor, you can end up hiring 
expertise in business continuity, system availability, 
security, data storage and archiving, and scaling the 
data warehouse/BI environment as needed, all for a 
pay-as-you-go price that generally equates to a one 
or two magnitudes decrease in implementation and 
maintenance costs compared with doing it yourself. 
Your time-to-terabyte will be measured in days rather 
than weeks or months. 
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Comply with Internal and 
External Standards 

Having an enterprise overview of how and where data 
is used makes figuring out if you’re in compliance 
much easier. Compliance is another instance in which 
MDM becomes a must-have component in your 
organization. Although complying with both internal 
and external standards seems like a big headache 
prior to a security breach or event, you'll be glad you 
did in the event of an attack. Every organization has 
decentralized data management, so I recommend 
using compliance regulations such as Sarbanes-Oxley 
to level the playing field. An outside influencer can 
often do more to predicate cooperation than internal 
directives. Measuring compliance can be a tough and 
time-consuming job, so you might want to look into 
risk management software such as RiskWatch, which 
was originally built to track HIPAA compliance but 
now includes features that evaluate total enterprise 
risk. 


Data Warehouse Design Tips 
It’s not possible to describe or even list all the tech- 
nical design issues that you could encounter when 
building a data warehouse. However, the following 
data warehouse design tips can help you minimize 
problems in your data warehouse: 
e Don’t confuse measures with context. Measure- 
ments (measures) are usually numeric, can be 
aggregated, and are stored in fact tables, whereas 
context is descriptive and open-ended. Divide the 
context into logical groupings such as customers, 
store locations, and time, and store them as dimen- 
sions. (For more information about measures, see 
“Data Warehousing: Measuring the Facts,” Sep- 
tember 2007, InstantDoc ID 96336.) 
Define the grain of the fact table. To do so, start by 
determining exactly what a fact table represents: For 
example, is it an individual insurance policy transac- 
tion or a weekly inventory count? Define the grain 
of a fact table in terms of a clear business objective 
and a set of business rules. Don’t be afraid to store 
too fine a level of detail if you need to because 
you can always aggregate the details, but you can’t 
extrapolate details from summarized data. 
Don’t snowflake when a simple star schema will 
do. The purpose of data warehouse design isn’t 
to normalize the data (as you attempt to do when 
you snowflake) but rather to organize the data 
in such a way to be intuitively meaningful to end 
users. That often means you need to de-normalize 
the dimensions. (You can learn more about dimen- 
sions by reading “Data Warehousing: Dimension 
Basics,” October 2007, InstantDoc ID 96846.) 
e Partition fact tables on a date key. Partitioning fact 
tables on a date key makes data maintenance much 
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easier. If you need to delete or archive historical 
data, you can “retire” the oldest partitions. Index 
maintenance is faster if the data is partitioned, as 
is data retrieval because SQL Server has to look 
only in relevant partitions for data. In addition, 
always align the partitioned indexes by using the 
same partition schema that you used to partition 
the base table. If you partitioned the base table on 
a date column, then index the date column and 
use the same partition 
scheme for the index. 

I recommend keeping 
empty partitions at 
both ends of the table 
so that it’s easy to add 
and remove partitions, 
and choose the parti- 
tion grain carefully 
because there’s a max- 
imum of 1,000 parti- 
tions in SQL Server 
2008 and 2005. (To 
find out more about 
partitioning fact tables, 
see “Data Warehousing: Horizontally Partitioning 
the Fact Table,” April 2008, InstantDoc ID 98007.) 
Cluster fact tables on a date key. Doing so lets you 
efficiently retrieve historical slices of data and 
supports queries that are used to populate cubes— 
most of which have a major time component. (For 
more information, see “Indexing the Data Ware- 
house,” August 2008, InstantDoc ID 99330.) 
Query the star schema using WHERE on the par- 
titioning key. For instance, if the fact table (Sales) 
has been partitioned by date (dateKEY), your 
query would look like 


SELECT... WHERE Sales.dateKEY between '2009-01-91' 
and '2009-@1-31' 


Because SQL Server 2008 and 2005 Developer Edi- 
tion and Enterprise Edition are partition-aware, 
you'll want to use literals, not variables, to point 
the query analyzer to the correct partition and to 
optimize SQL Server 2005’s idiosyncrasies regarding 
multi-partition query scans. 


Data Warehouse Design Best 
Practices 
There are so many lessons still to be learned about 
designing data warehouses. From these lessons we 
glean both business and technical best practices. As 
you discover new, and possibly better, ways to per- 
form these tasks, spread the wealth by sharing your 
solutions with your peers. SOL 
InstantDoc ID 102119 
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Don't be afraid to store 
too fine a level of detail if 
you need to because you 
can always aggregate 

the details, but you can’t 
extrapolate details from 
summarized data. 
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Diving Deeper into Wait Stats 


See what each connection or thread in a 
SQL Server instance Is waiting on 


| two previous articles—“Getting to Know Wait 
n Stats” (InstantDoc ID 96746) and “Dissecting 
SQL Server’s Top Waits” (InstantDoc ID 98112)—I 
discussed SQL Server 2005 waits and showed you 
some of the more common ones. Those articles pro- 
vided a mostly high-level view, aggregating waits up to 
the instance level by individual wait type. The articles 
provided good advice for determining what SQL 
Server is waiting on most, but sometimes you need 
a deeper view of what’s going on. For example, from 
a high-level peek at the wait stats, you might see that 
youre waiting on lots of disk-to-memory transfers— 
based on large amounts of PAGEIOLATCH_SH 
waits shown—but you wouldn’t know who or what is 
mainly responsible for these transfers. 

Wouldn’t it be nice to see what each connection 
or even thread in the SQL Server instance is waiting 
on? Actually, this ability has always been available, 
starting with SQL Server 7.0 and the sysprocesses 
system pseudo table. I say “pseudo table” because 
there were a number of system tables that weren’t even 
real tables. They were simply derived from memory 
and presented to the user as a table when queried. 
Over the years, with each new release, more and 
more of these “tables” have been added, especially 
with all the dynamic management views (DMVs) and 
dynamic management functions (DMFs) we have in 
the SQL Server 2008 era. 

Querying the sysprocesses table was the method 
of choice if you were looking for performance-related 
problems before SQL Server 2005. But even though 
sysprocesses is still around in the form of a compat- 
ibility view (sys.syspro- 
cesses), the better choice 
now is a set of DMVs or 


SQL Server version. In contrast, the compatibility 
views might be depreciated or at least not kept current 
with all the new functionality in future versions. I'll 
touch on only some of these here, so I recommend 
that you spend some time getting familiar with them 
if you aren’t already. 


Strategic DMVs 

So, what are these objects, anyway? The main object 
of interest for my purpose is the sys.dm_os_waiting_ 
tasks DMV, which tells me which connections (or 
threads, if there are multiple threads per connection) 
are waiting and what they’re waiting on. However, 
although that DMV tells me things such as the wait 
type and how long it has been waiting, I need to join 
it with another DMV called sys.dm_exec_connections 
to get the handle that points to the actual code that’s 
being executed. And from there, I need to invoke 
or technically CROSS APPLY the DMF called sys 
.dm_exec_sql_text to get the readable version of the 
code or stored procedure that’s being executed at 
that time. 

That sounds complicated, so perhaps the code 
snippet in Listing 1 will help. Note that I’ve added a 
WHERE clause to limit the sessions shown to only 
non-system sessions and also to ensure that there’s 
no actual wait time greater than 0. You can adapt this 
WHERE clause as necessary, but most of the time I 
would mainly be concerned with direct user-related 
activities and only those activities that are accumu- 
lating waits. 

In Figure 1, you can see the waits that resulted 


LISTING |: Discovering Waits 


SELECT tx.[text] AS [Executing SQL], wt.session_id, wt.wait_duration_ms, wt.wait_type, 


wt.resource_address, wt.blocking_session_id, wt.resource_ description 


DMFs that together give 
even more information— 
and are guaranteed to be 
updated with each new 


CROSS APPLY 


FROM sys.dm_os_waiting_tasks AS wt INNER JOIN sys.dm_exec_connections AS ec 
ON wt.session_id = ec.session_id 


(SELECT * FROM sys.dm_exec_sq]l_text(ec.most_recent_sql_handle)) AS tx 
WHERE wt.session_id > 5@ AND wt.wait_duration_ms > ø 
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Executing SQL session_id wait_duration_ms wait_type 


SELECT * FROM dbo.OrderHeader... 54 4 


_ PREEMPTIVE_OS_WAITFORSINGLEOBJECT 


Figure | 


Pre-emptive wait 
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DIVING DEEPER INTO WAIT STATS 


select 


Figure 2 


Blocked query 


Executing SQL 


op 10 * from OrderDetai 


53 6066 LCK_M_S 


session_id wait_duration_ms wait_type resource_address blocking_session_id resource_description 
0x0534F360 55 


keylock hobtid=720575 


Blocked_Session_ID 


Figure 3 
Blocker 


Blocked_SQL 
select top 10 * from OrderDetail LCK_M_S 55 


Blocked_Resource Blocking_Session_ID Blocking_SQL 


BEGIN TRAN update OrderDetail| 


LISTING 2: Discovering Blocking 


SELECT 
Blocked.Session_ID AS Blocked_Session_ID 
, Blocked_SQL.text AS Blocked_SQL 
, Waits.wait_type AS Blocked_Resource 
, Blocking.Session_ID AS Blocking_Session_ID 
, Blocking_SQL.text AS Blocking _SQL 
, GETDATE() 


FROM sys.dm_exec_connections AS Blocking INNER JOIN sys.dm_exec_requests AS Blocked 
ON Blocked.Blocking_Session_ID = Blocking.Session_ID 


CROSS APPLY 
¢ 


SELECT * FROM sys.dm_exec_sq]l_text(Blocking.most_recent_sql_handle) 


) AS Blocking_SQL 
CROSS APPLY 
¢ 


SELECT * FROM sys.dm_exec_sq]l_text(Blocked.sql_handle) 


) AS Blocked_SQL 
INNER JOIN sys.dm_os_waiting_tasks AS waits 
ON waits.Session_ID = Blocked.Session_ID 


when I ran a simple query that selected all the rows 
in a large table called OrderHeader, using my sample 
query. I ran this code on my laptop, which had lots 
of other things going on, so I happened to spot this 
query waiting on one of the new Preemptive wait 
types in SQL Server 2008 (which I'll discuss in a 
future article). Remember that the focus of this article 
is to show how you can drill down into the waits and 
see which statements were 
causing the waits. In this 


Only at this level 
can you begin to 
troubleshoot 

the “who” and the 


case, it’s clear not only 
which connection (SPID 
54) caused the waits but 
also which statement was 
used. 

This sample scenario 
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“why” of the was based on waiting 


on an external resource, 

blocki ng. but you can also wait 

on locks taken by other 

users (known simply as blocking). Both will show up 

in the high-level wait stats, but only at this level can 

you begin to troubleshoot the “who” and the “why” 

of the blocking. Figure 2 shows another example of 

a query that is waiting, but this time it’s being blocked 
by an update that hasn’t yet been committed. 

You can see that the blocking_session_ 

id column is populated, and you can use that 

information along with yet another DMV— 


sys.dm_exec_requests—to get a snapshot view into 
both the blocked session and the blocker, as you see 
in Figure 3, using the query in Listing 2. 


Streamlined 
Keep in mind that there are many good code exam- 
ples that can provide much more detail for all aspects 
of this topic, if you need them. But, at the core, they 
almost universally use these DMVs similarly to what 
I’ve shown you here. These DMYs will satisfy most 
casual probes, while resulting in very little overhead 
on your system. Knowing how the DMVs work and 
what they can provide will get you moving in the right 
direction when you need to drill down a little further. 
SQL] 
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icrosoft’s SharePoint technologies are 

among the fastest growing products 

released by the company. As organiza- 
tions around the world adopt SharePoint as the 
platform for intranet, extranet, and Internet sites, 
DBAs and IT pros are stepping in as SharePoint 
administrators for their organizations. This article is 
an introduction to the security considerations asso- 
ciated with the installation and administration of 
Windows SharePoint Services (WSS) and Microsoft 
Office SharePoint Server (MOSS) 2007. 


Server Farm-Level Accounts 
When adding SharePoint to your environment, it’s 
important to identify and document the service accounts 
you'll be using to install SharePoint, to connect Share- 
Point to the various applications and services it needs, 
and to provide SharePoint with access to external 
systems and repositories. (For more information about 
SharePoint's server farm tiers, see the web-exclusive 
article "Understanding SharePoint's Architecture," 
www.sqlmag.com, InstantDoc ID 102148.) Microsoft 
has identified four categories of SharePoint security 
accounts: server farm—level accounts, Shared Service 
Provider (SSP) accounts, WSS search accounts, and 
application pool accounts. 

Server farm-—level service accounts are used to set 
up SQL Server and SharePoint and to provide an 
identity to SharePoint’s administrative tools such as 
Central Administration and the WSS Timer service. 
These accounts are typically domain user accounts that 
belong to a variety of AD groups, including the local 
Administrators group. In SQL Server, these service 
accounts can operate with the least privilege admin- 
istration rights by granting db_owner privileges inside 
of the various SharePoint databases and securityadmin 
and dbcreator fixed server roles. Although these seem to 
be powerful rights, they’re the least privileges required 
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SharePoi 
Secu 


ri 


“DBAs 


Configuring Active Directory service accounts and 
SharePoint's internal security layers 


by SharePoint. SharePoint takes over the creation and 
ownership of its back-end databases on SQL Server. 

The SQL Server service account is required during 
the installation of SQL Server and isn’t utilized by 
SharePoint. SQL Server will prompt for this Local 
System or domain user account. A domain account 
is preferred because only domain user accounts can 
make remote procedure calls, back up to network 
drives, participate in replication, use the SQL Mail fea- 
tures, or any other activity requiring network resource 
access controlled by Windows domain authentication. 
This service account is used to run some of the ten 
SQL Server services, including MSSQLServer and SQL 
Server Agent, and potentially MSSQLServerOLAP- 
Service, ReportServer, and SQLBrowser. This service 
account should be kept completely separate from Share- 
Point. The different SQL Server component services 
might run under different service accounts to provide 
more granular security. These service accounts are the 
basis for security in SQL Server. The rights assigned to 
these accounts and the context associated with each one 
define which resources are available on your server. If 
your other layers of security are penetrated by a mali- 
cious program or attacker, the different limits on each 
service account help to limit the attack surface. 

The SharePoint Setup user account is the account 
used by IT staff to log into the server and to run the 
setup application for the SharePoint installation pro- 
cess. It must be a domain user account and a member 
of the Administrators group on the local server(s). 
In addition, this account must be made a member 
of the securityadmin and dbcreator fixed server roles 
inside of SQL Server and must be granted db_owner 
privileges for any database that might be affected by 
SharePoint or the SharePoint command-line 
utilities. By default, these fixed server roles 
and database privileges are granted during 
the installation of the SharePoint software if 
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Log on as: 


Windows SharePoint Services Timer Properties (Local Co 


General Log On | Recovery | Dependencies | 


™ Local System account 


T Allow service to interact with desktop 
@ This account! [testdomain\moss_configadmin Browse... | 


Hardware Profile 


Profile 1 


Figure | 


WSS Timer service Log 
On identity 
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Enable 


Disable | 


you choose the simple (SQL Server Express) installa- 
tion. However, when choosing the more complex server 
farm installation, the Setup user account must have the 
required access to the database before beginning the 
install. The Setup user account also runs the Stsadm 
and PSConfig command-line utilities. The Microsoft 
article “How to change service account and service 
account passwords in SharePoint Server 2007 and 
Windows SharePoint Services 3.0” (support.microsoft 
.com/kb/934838) outlines the procedure to change the 
passwords for MOSS 2007 service accounts using the 
Stsadm UpdateFarmCredentials command. 

The final server farm-—level service account is the 
Server farm account or the Database Access account. 
This account acts as the IIS application pool identity 
for the Central Administration site and is the Log On 
identity of the WSS Timer service, as Figure | shows. 
This account must be a domain user account and is 
automatically granted the appropriate dbcreator, secu- 
rityadmin, and db_owner permissions to the back-end 
SQL Server during the SharePoint installation process. 
Furthermore, this account is used by SharePoint to 
access AD when MOSS’s People Picker is used to find 
and validate people in the portal. The People Picker 
field can be found when manipulating site collection 
owners, when adding users to People and Groups, and 
when using the People column type in lists. In addition, 
if the server is configured to receive email and automati- 
cally create AD distribution groups and contacts, this 
account is used to configure AD. 


SSP Accounts 
SSP accounts are associated with MOSS Standard 
Edition and Enterprise Edition. These accounts are 


used to control MOSS search services, timer services, 
content crawling, AD or LDAP profile import, ECS, 
and IIS Application Pool accounts. These SSP service 
accounts operate in a similar set of least privilege condi- 
tions as the server farm-level accounts. Typically, these 
accounts are domain user accounts and not members 
of the local Administrators group. In SQL Server, these 
accounts are granted db_owner fixed database roles for 
some databases and simple read and write privileges for 
others. Once the account has been designated in the SSP 
administrative pages, SharePoint will automatically con- 
figure the correct levels of access in SQL Server, using its 
higher-privilege account as a management connection. 

Like other SharePoint components, SSPs require 
database ownership privileges to manage objects and 
security within those databases. Most DBAs are used 
to having to configure database security separately from 
application security. However, SharePoint and its com- 
ponents smoothly integrate this process, bypassing the 
need for separate database security management. The 
most prominent service account at the SSP level is the 
SSP application pool account. This account provides 
process isolation for the SSP web application. During 
the setup of the SSP, the required permissions for this 
domain user account are granted automatically. This 
service account is granted db_owner permission for 
the SSP content database, read and write access to the 
content databases of the web applications that utilize 
the SSP, read access to the configuration database, and 
read access to the Central Administration database. 

The SSP service account is used by the SSP web 
services for inter-server communications and the 
SSP Timer service. This service account is also the 
application pool identity. No manual configuration is 
necessary for this account because it’s automatically 
granted the same permissions as the SSP application 
pool account. Although this account needs to be a 
domain user account, it shouldn’t be a member of the 
Administrators group on any computer in the server 
farm. Although Microsoft’s documentation refers to 
the SSP service account and the SSP application pool 
account as different service accounts, it’s actually a best 
practice to make these the same domain user. 

The SharePoint search services can actually utilize 
several distinct service accounts. The Office SharePoint 
Server Search service account is used by the Search 
service. This account is unique in that there’s only one 
instance of this search service in a farm, and it’s shared 
by all of the SSPs that exist in the network. This account 
must be a domain user account but not a member of the 
Farm Administrators group. Database access is granted 
automatically. The default content access account is 
the default crawl account used by the Search service to 
crawl content (unless a specific authentication method is 
required by a crawl rule for a specific URL). The default 
content access account is a domain user account that 
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must have read access to all external or secure content 
sources that will be crawled by this account. This means 
that any site that’s not a member of the server farm must 
include this account as a user and grant this account full 
read access. Database access is granted automatically. 
Although there’s a default content access account, it’s 
possible that some content sources might require specific 
authentication using a Content Access account, which is 
specified in the SSP’s crawl rules. 

The other two major services provided by an SSP 
are the Profile import service and ECS. Each of these 
services has a unique service account. The Profile import 
default access account is used to import profile data from 
AD, LDAP directories, Business Data Catalog (BDC) 
applications, or other directory sources. This account 
must have read access to the directory service and must 
have the Manage User Profiles personalization services 
permission. (Read “Manage user profiles and user pro- 
file properties” at technet2.microsoft.com/Office/en-us/ 
library/9df61a21-96ca-4b3c-b912-e472c686e8871033 
-mspx?mfr=true to learn more about this permission.) 
The Profile import default access account is automati- 
cally granted this personalization permission unless the 
default content access account (Farm Search Service 
Account) isn’t a member of the Farm Administrators 
group. If BDC imports are used, the service account 
must have view permissions on the entities used in the 
BDC connection. The other service account is the Excel 
Services unattended service account. This account is 
used to connect to external data sources that don’t use 
Kerberos or Integrated Windows Authentication in 
the connection string. Although the target data source 
doesn’t use Windows for authentication, the service 
account must be a domain user account. 


WSS Search Accounts 

WSS Search accounts are granted high-level rights 
on the server and in SharePoint because they require 
access to all of the available content for search indexing, 
regardless of applied permissions. The WSS search 
account is used to control the WSS Search service and 
to access content in SharePoint sites. This account must 
be a domain user account that’s not a member of the 
Farm Administrators group. Database permissions, 
including read access to the configuration database and 
the SharePoint_Admin content database, and member- 
ship in the db_owner role for the SharePoint Services 
search database are all granted automatically. As with 
the Office SharePoint Server Search service account, 
there’s only one WSS search account in a farm. 

The WSS Search Content Access account is used 
to crawl content across the various SharePoint sites, 
which might require distinct credentials. Typically, these 
accounts are used when crawl rules specifying login cre- 
dentials are created. The requirements for this account 
are the same as the requirements for the WSS Search 
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service account. This service account is automatically 
added to the web application’s full-read policy for the 
farm if MOSS isn’t installed. If MOSS is installed, 
the WSS Search service account is used to index Help 
content for searching inside of the Help popup, and the 
Content Access account isn’t used. 


IIS Application Pool Accounts 
IIS application pools are used to provide process 
isolation in IIS. Process isolation prevents an error in 
one application from affecting applications running 
in a different process. The HS application pool service 
accounts are typically granted a reduced set of server 
privileges and are used as the actual database owner 
of the SharePoint content database for the appropriate 
sites. The primary use for this service account is to 
provide SharePoint with read and write access to the 
content databases associated with the web applications 
that reside in that particular application pool. Joel 
Oleson, formerly of Microsoft, recommended in his 
blog (blogs:msdn.com/joelo/archive/2006/07/19/671789 
.aspx) that a single Share- 

Point server not attempt 


to use more than four Most DBAs are used to 


distinct application pools 


(with 10 being the max- having to configure 


imum) to prevent potential 


performance problems, Gatabase security separately 


(This recommendation 
doesn’t include the Cen- 


from application security. 


tral Administration or SSP_ However, SharePoint and its 


application pools because 


these applications are used COM ponents smooth ly 


infrequently.) Each Share- 


Point application pool uses integrate this Process, 


an average of 800MB of 


RAM, severely limiting bypassing the need for 


32-bit systems. Following 
these recommendations, 
you'll likely want to group management. 
your SharePoint web appli- 

cations under a single application pool. 


Configuring Security in 
SharePoint 

Securing SharePoint begins with the definition and 
assignment of AD service accounts but doesn’t end 
there. SharePoint also has many different internal 
security layers, including the server or server farm level, 
the shared services level, and the site level. (For more 
information about SharePoint’s logical layers, see “Log- 
ical architecture components” at technet2.microsoft 
.com/Office/en-us/library/aaed3a0 1 -f4dc-4353-abda- 
Obeced2080b61033.mspx?mfr=true.) Administration 
actions can occur on any of these layers, so SharePoint 
provides a security infrastructure to help control access 


separate database security 
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to these locations. 

Security in SharePoint is handled by the assignment 
of rights and roles. Foremost among these rights are the 
administrative permissions, which provide control over 
SharePoint’s functionality. The individuals performing 
administrative actions at any of these levels might or 
might not be the same individual. Users and groups can 
be granted administrative permissions on the following 
layers in SharePoint. 

Farm level. Members of the Administrators group 
on the local server have the ability to change any aspect 
of SharePoint, including the fundamental building 
blocks upon which SharePoint relies, SQL Server, 
and IIS. Despite this enormous level of power, server 
administrators aren’t automatically granted access to 
the content stored in SharePoint sites. 

Farm administrators are identified in Central 
Administration. These users have permission to per- 
form any administrative task in Central Administra- 
tion. Members of this group can also use the Stsadm 
command-line tool to perform tasks such as scheduled 
solution deployments and unattended SharePoint 
backups. Similar to members of the local server’s 
Administrators group, farm administrators don’t auto- 
matically have access to 
site content. 

Shared services level. 
SSP administrators are 


of security are penetrated defined within the SSP 
by a malicious program itself and can control which 


services are included in the 


or attacker, the different ssp, suchas Excel Services 


and User Profile Imports. 


limits on each service (or more information, 
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account help to limit see “Shared Service Eh 


viders and Delegation’ 


the attack surface. at blogsmsdn.com/joelo/ 


archive/2006/10/03/Shared- 
Service-Providers-_2800_SSPs_2900_-What-Are- 
They_2E002E002E00_-How-to_2700_s-On-Delegation 
.aspx.) Service administrators at the shared services level 
are able to configure settings for a specific service within 
an SSP. Frequently, users are granted service administra- 
tion permissions so that they have the ability to tune 
SharePoint’s search functionality for the SSP by defining 
authoritative pages and search keywords. 

Site level. Site collection administrators are defined 
in Central Administration for each site collection in 
every web application and are given Full Control of 
every site within that site collection. Site collection 
administrators don’t require additional permissions at 
the site level to have access to that site’s content because 
their level of control is granted centrally. Because site 
collection administrators might be defined in Central 
Administration, farm administrators or local server 
administrators could add themselves to this group to 


access otherwise unavailable content and perform site- 
level administration activities. 

The Home Owners group is automatically created 
on every SharePoint site in the People and Groups area. 
Members of the Home Owners group are site owners 
and are granted Full Control on that site. Site owners 
can be distinct for every site in a site collection or can be 
inherited from the top-level site. These users have access 
to the site settings menu and can modify the settings of 
any list or library on the site. In addition, custom groups 
can be created and assigned Full Control permissions, 
which equal the default permissions of the Home 
Owners group. The main difference between the Home 
Owners group and custom groups with Full Control 
and site collection administrators is that subsites can 
opt to specify unique site permissions that exclude 
the parent site’s Home Owners group. Site collection 
administrators can exercise Full Control through every 
site in the collection. 


SharePoint’s Policy for Web 
Application Feature 

Although it’s not truly an administrative level, Share- 
Point’s Policy for Web Application feature deserves 
a mention here. Defined in Central Administration’s 
Application Management tab, members of the Read 
policy group or Deny policy group can be granted 
read access to the entire farm or denied read access to 
the entire farm. These policy settings override site-level 
permissions. A scenario in which you might use these 
settings is to provide an auditor access to the entire site 
for a short period of time. You wouldn’t want to navigate 
to each location the auditor needed but rather grant the 
auditor access for the time they'll be on site and then 
remove the auditor’s rights once they leave. Another 
reason to add a user to a policy group might be if 
someone leaves your company and you need to remove 
their access rights immediately. In that case, you can 
deny the user’s rights in the policy of the web application 
without going to each location to remove them. Doing 
so prevents the user from using the site while giving you 
time to correctly remove the user from the site. 


A Secure SharePoint 
Environment 
DBAs and IT pros are increasingly finding themselves 
associated with SharePoint projects. SharePoint is a 
complex system that requires some serious study to 
follow Microsoft’s best practices. There are twelve dif- 
ferent types of service accounts available on the network, 
three distinct levels of security configuration in Share- 
Point itself, and a host of data connection capabilities for 
extensibility. Gaining an understanding of the various 
surfaces for security configuration is an important 
aspect of an effective SharePoint installation. [SQL 
InstantDoc ID 101724 
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PIVOT query uses a CTE 


he PIVOT operator, which was introduced in 
i SQL Server 2005, lets you create results that 
are pivoted, essentially using the data from 
one of the columns as column headers. For example, 
suppose you want to create a report that breaks down 
sales by year and month so that you can compare 
sales months for different years. Using the 2005 or 
2008 version of the AdventureWorks database, you 
can create a query summarizing the data with the 
code in Listing 1. Table 1 shows an excerpt from the 
results. As you can see, looking for trends by month 
isn’t easy. 

Table 2, page 32, shows the results as you would 
like to see them. In Table 2, the data is summarized 
and displayed so that the months can be easily com- 
pared from year to year. I’ll explain how to write 
queries that use the PIVOT operator to produce the 
results shown in Table 2. 

Note that I won’t be using the PIVOT syntax 
shown in SQL Server Books Online (BOL) because 


LISTING I: Query that Summarizes 
Sales by Year and Month 


SELECT SUM(TotalDue) TotalDue, YEAR(OrderDate) AS YearOrdered, 


MONTH(OrderDate) AS MonthOrdered 
FROM Sales.SalesOrderHeader 
GROUP BY YEAR(OrderDate) ,MONTH(OrderDate) 
ORDER BY YEAR(OrderDate) ,MONTH(OrderDate) 


TABLE |: Partial Results 
from the Query in Listing | 


TotalDue YearOrdered MonthOrdered 
1172359.4289 2001 7 
2605514.9809 2001 8 
2073058.5385 2001 9 
3781879.0708 2002 i 
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that syntax can be difficult to understand at first 
glance. The syntax in BOL uses a derived table as the 
basis for the PIVOT query. What BOL doesn’t point 
out is that you can use a common table expression 
(CTE) instead. Listing 2, page 32, shows the syntax 
for a PIVOT query that uses a CTE. As you can see, 
there are two main parts: a base query (callout A) and 
a PIVOT expression (callout B). This syntax might 
look intimidating, but I'll guide you through it step 
by step. 


Step I. Write the Base Query 
Writing the base query takes a bit of planning. It’s 
important that this query include only the columns 
that will be needed in the final results. Any columns 
not pivoted or aggregated will end up as groupings, 
so any unnecessary columns will cause 
extraneous grouping levels and unexpected 
results, 

In this case, the Sales.SalesOrderHeader 
table has more columns than 
what is needed in the results. All you 
need is the OrderYear, OrderMonth, 
and TotalDue columns. You're going to 
group by the OrderYear column, pivot 
by the OrderMonth column, and aggre- 
gate the TotalDue column, so the base 
query is 


SELECT TotalDue, YEAR(OrderDate) 
AS OrderYear, 
DATENAME (MONTH, OrderDate) 
AS MonthName 

FROM Sales.SalesOrderHeader 


The code in Listing 3, page 32, shows this base query 
as a CTE. You should execute the CTE to make sure 
that the necessary columns are present and that there 
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R i CREATE PIVOTED TABLES IN 3 STEPS 


TABLE 2: The Pivoted Results 


Year January February March eA November December 
2001 NULL NULL NULL ee 3690018.6652 3097637.3384 
2002 1605782.1915  3130823.0378  2643081.0798 4427598.0006  3545522.7380 
2003 2233575.1127 — 3705635.4979 + 2611621125961 (E 5961182.6761  6582833.0438 
2004 3691013.2227  5207182.5122 52727868106... NULL NULL 


LISTING 2: Syntax for a PIVOT Query 
that Uses a CTE 


@)-- The base query 
WITH <cte name> AS ( 
<Select query that produces the data>) 
SELECT <non-pivoted column>, 
[<first pivoted column>] AS <column name>, 
[<second pivoted column>] AS <column name>, 


[<last pivoted column>] AS <column name> 


you want displayed as column headers. In other 
words, which values that are currently displayed 
vertically do you want to display horizontally? In this 
example, the pivoted column is OrderMonth, so the 
code looks like 


PIVOT(SUM(TotalDue) FOR OrderMonth 


FROM <cte name> 


@)-- The PIVOT expression 
C 


<aggregation function>(<column being aggregated>) 


FOR 


[<column that contains the values that will become column headers>] 
IN ( [<first pivoted column>], [<second pivoted column>], 
. [klast pivoted column>]) 
) AS <alias for the pivoted table> 


<optional ORDER BY clause> 
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LISTING 3: The Base Query 
asa CTE 
WITH BaseQuery AS( 
SELECT TotalDue, YEAR(OrderDate) AS OrderYear, 
DATENAME (MONTH, OrderDate) AS OrderMonth 
FROM Sales.SalesOrderHeader 


) 
SELECT * FROM BaseQuery 


aren't any extraneous columns. (The results will not 
be aggregated or pivoted at this point.) 


Step 2. Create the PIVOT 
Expression 

The next step is to create the PIVOT expression. The 
first element in the PIVOT expression is an aggre- 
gate function. Often this function will be SUM. The 
parameter of the aggregate function is the name of 
the column to be aggregated. The function’s results will 
show up under the pivoted columns. In this example, 
you want to compare the sum of the TotalDue values 
by month, so TotalDue is the aggregated column. The 
PIVOT expression with the aggregate function is then 


PIVOT (SUM(TotalDue) 


After the aggregate function, you must type the 
keyword FOR followed by the name of the pivoted 
column. To determine the pivoted column, you need 
to figure out which column contains the values that 


The pivoted column’s name is followed by an IN 
list that’s similar to one found in a WHERE clause. 
This IN list serves two purposes. First, it restricts 
the rows that are pivoted. Second, it supplies the 
pivoted column names. If the values that will end up 
as column names don’t follow the rules for regular 
identifiers, they must be surrounded by brackets ([ ]). 
For instance, if this example used month numbers 
instead of month names, you’d need to place each 
month number inside brackets. 

One limitation of PIVOT queries is that they 
aren’t dynamic, so all the column headers need to be 
hard-coded. If the pivoted column has values that 
vary over time, this part of the expression must be 
modified each time the data in the pivoted column 
changes. In this example, the column headers aren’t 
likely to change because they’re the months of the 
year. However, if a query compared sales by territo- 
ries or sales by salespeople, the query would probably 
have to be modified frequently. Another option would 
be to write a stored procedure using dynamic SQL to 
determine the column headings. If you're interested in 
learning how to create a dynamic pivot query, see Itzik 
Ben-Gan’s web-exclusive article “Dynamic Pivoting” 
(InstantDoc ID 43140) or Inside Microsoft SQL 
Server 2005 T-SQL_Programming (Microsoft Press, 
2006) by Ben-Gan, Dejan Sarka, and Roger Wolter. 

The IN list needs to be enclosed in parentheses so 
the PIVOT expression now looks like 


PIVOT(SUM(TotalDue) FOR OrderMonth 
IN(January, February ,March, April ,May, 
July, July,August, September ,October, 
November , December) 


Next, you must give an alias to the PIVOT expression. 
An alias is required because the PIVOT function’s 
results are treated as a table. The alias goes after the 
final closing parenthesis, so in this case, the PIVOT 
expression looks like 
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CREATE PIVOTED TABLES IN 3 STEPS A i 


PIVOT(SUM(TotalDue) FOR OrderMonth 
IN(January, February ,March, April ,May, 
July, July,August, September ,October, 
November ,December)) AS PVT 


The PIVOT expression is now complete. As callout B 
in Listing 2 shows, it goes after the FROM clause and 
before the ORDER BY clause if there is one. 


Step 3. Add the Column Names 
to the SELECT List 

At this point, you need to add the column names to 
the SELECT list in the main query. In this case, the 
columns are Order Year and the pivoted columns. You 
should not list the aggregated column, TotalDue, so 
the SELECT list looks like 


SELECT OrderYear, January, February, 
March,April,May,July,July,August, 
September , October, November , December 


The pivoted columns will display in the order that 
they’re listed in the SELECT clause. Listing 4 shows 
the completed PIVOT query, which created the piv- 
oted results in Table 2. 


Variations 

There are many ways the PIVOT query can vary. 
Take, for example, the PIVOT query in Listing 5. This 
query varies from the one in Listing 4 two ways: 

e It uses aliased column names. In this query, the 
numeric month of the order date is used in the 
base query. The pivoted column names are aliased 
in the SELECT list so that the column headers are 
month names rather than month numbers. To save 
typing, it’s advantageous to produce the desired 
column headings in the base query, which in this 
case is the CTE. 

It uses an ORDER BY clause to order the 
returned data. The ORDER BY clause needs to 
go after the PIVOT expression. The only columns 
allowed in the ORDER BY clause are those 

that actually show up as columns in the results. 


LEARNING PATH 


SQL SERVER MAGAZINE RESOURCES 
Itzik Ben-Gan on PIVOT: 


“Pivot (or Unpivot) Your Data,” InstantDoc ID 
42901 


“Dynamic Pivoting,” InstantDoc ID 43140 


“Enhancing PIVOT for Crosstab Queries,” 
InstantDoc ID 93907 


“PIVOT on Steroids,” InstantDoc ID 94268 
“Views and Dynamic PIVOT,” InstantDoc ID 94856 
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LISTING 4: The Completed 
PIVOT Query 


WITH BaseQuery AS( 
SELECT TotalDue, YEAR(OrderDate) AS OrderYear, 
DATENAME (MONTH, OrderDate) AS OrderMonth 
FROM Sales.SalesOrderHeader 


SELECT OrderYear, January, February ,March,April,May, 
July, July,August, September , October , November , December 

FROM BaseQuery 

PIVOT(SUM(TotalDue) FOR OrderMonth IN (January,February,March, 
April, May, July, July,August, September, October ,November, 


December)) AS PVT 


LISTING 5: Another Example of a 


PIVOT Query 


WITH BaseQuery AS( 


SELECT TotalDue, YEAR(OrderDate) AS OrderYear, 
MONTH(OrderDate) AS OrderMonth 
FROM Sales.SalesOrderHeader 


) 


SELECT OrderYear,[1] AS [January],[2] AS [February], 
[3] AS [March],[4] AS [April],[5] AS [May],[6] AS [July], 
[7] AS [July], [8] AS [August],[9] AS [September], 
[18] AS [October], [11] AS [November], [12] AS [December] 


FROM BaseQuery 


PIVOT(SUM(TotalDue) FOR OrderMonth IN ([1], [2], [3], [4], [5], 
[6], £7], [8], IRASE 


ORDER BY January DESC 


You can include the columns used for grouping 
and those specified in the IN list. For example, 
this query uses ORDER BY January DESC to 
display the results in order of highest to lowest 
sales in January. If you want to display the year 
in descending order, you would put ORDER BY 
Order Year DESC after the PIVOT expression. 


A Useful Tool 
Although the PIVOT 
operator might look 
intimidating, writing a 
PIVOT query isn’t that 
difficult if you take it 
step by step. The PIVOT 
operator is perfect for 
pivoting results when 
the pivoted columns 
aren't likely to change. 
Although its disap- 
pointing that the PIVOT 
operator isnt dynamic, 
the PIVOT operator is 
still a very useful tool. (A 
dynamic PIVOT oper- 
ator is on my wish list for 
the next release of SQL 
Server since the feature 
didn’t make it into SQL 
Server 2008.) [SQL] 
InstantDoc ID 101684 


The PIVOT operator 

is perfect for pivoting 
results when the pivoted 
columns aren't likely to 
change. Although it’s 
disappointing that the 
PIVOT operator isn’t 
dynamic, the PIVOT 
operator is still a very 
useful tool. 
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Register at 
sqlmag.com/go/ 
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GetTheMostFromBl 


n top OT tne 
Join industry gurus Derek Comingare and Barry 
Ralston on July 23, 2009 for 3 lessons on leveraging BI 
to strengthen your company’s overall health, plus live 
Q&A sessions—all on your own computer! Learn how to 
reap additional business value from your organization’s 
existing data assets and increase the visibility of key 
organizational metrics. 


Derek Comingore is a Senior Architect with ComFrame 

Software, a SQL Server MVP, a certified professional in 

several Microsoft technologies including SQL Server BI, 

a speaker at several local and national SQL Server user 
groups, and is a published author on SQL Server topics. 


Barry Ralston is currently Chief Architect for Business 
Intelligence with Birmingham-based ComFrame 
Software. With a career in Business Intelligence 
spanning l4 years, he has previously been an invited 
speaker to the Oracle Development Tools User Group, International 
Oracle User Group - Americas and JavaOne. 


Learn more about the speakers, sessions, 
and how to reserve your seat at: 
sqimag.co 


E E 


WindowsIT Pro 


SQL defrag manager 2.5 


pe’ managing index density and fragmenta- 
tion levels can be a very time-consuming task, 
so I was happy to take a look at Idera’s SQL defrag 
manager 2.5 to see what it had to offer. Overall, I was 
very impressed with what I found. 

At a high level, SQL defrag manager 2.5 is a GUI 
that wraps existing T-SQL functionality for index 
analysis, defragmentation, and optimization. It lets you 
quickly analyze and manually defragment indexes with 
just a few clicks of the mouse. Moreover, SQL defrag 
manager makes it pretty easy to do at-a-glance index 
management by utilizing a dashboard approach that 
can be easily configured to highlight indexes, tables, or 
databases with problems. 

When analyzing SQL defrag manager 2.5 and 
putting it through the paces, I was impressed by how 
smoothly the installation went, and how easy it was 
to get everything set up. There’s no need to install any 
agents on your managed servers. Instead, the installa- 
tion process creates a new database that acts as a repos- 
itory for policies, as well as scheduling and historical 
data that can be used with the UI or admin console 
(which can be installed on any client machine). 

This tool is intuitive and easy to understand. The 
admin console has a Microsoft Office ribbon look and 
feel that makes navigating to different options and fea- 
tures a breeze. More importantly, SQL defrag manager 
makes manual analysis and correction of index frag- 
mentation a snap using the Fragmentation Explorer, 
which displays object hierarchies, index metrics, and 
fragmentation details. 

Where SQL defrag manager really shines, however, 
is by enabling automated index defragmentation. By 
using the admin console in conjunction with the spe- 
cialized Defragmentation Service and the repository 
(or custom database) created during installation, you 
can easily create policies (or jobs) that can be scheduled 
as needed. Policies are easy and intuitive to configure, 
and you can specify thresholds for fragmentation levels 
and scan densities (as well as filters for what kinds of 
conditions to ignore) that indicate when you need to 
take action. Policies respond to fragmented indexes by 
rebuilding or reorganizing them. 

Flexibility and control are rampant throughout the 
policy creation process. For example, policies can target 
entire servers, all the indexes of a given table, or selected 
indexes on one server and a handful of indexes on 
another. You just have to select which objects you want 
to be included from a tree view of registered servers and 
specify other criteria as needed via the appropriate tabs 
used for policy definitions. 
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One of my favorite features enables policies to take 
no action; instead, problematic indexes are dropped into 
a to-do list where you can manually and individually 
analyze and deal with them. Another place where SQL 
defrag manager shows considerable maturity and real- 
world clout is when it comes to defining Resource Checks 
that can be evaluated before each policy is executed, as 
Web Figure | shows (www.sqlmag.com, InstantDoc 
ID 102030). This feature lets you craft moderately 
aggressive defragmentation policies but still ensure that 
if an extraction, transformation, and loading (ETL) 
job runs a bit late, policy execution won't bring the 
server to its knees by blindly starting when scheduled. 

Sadly, there’s no way to fire SQL defrag manager 
policies off manually. So, if you’ve got a policy defined 
for a few problematic tables that you need to run on a 
semi-regular basis, you either have to manually defrag 
those tables or temporarily tweak the policy’s 
schedule. This is a big limitation in my mind 
because it prevents you from using SQL 
defrag manager’s more powerful capabili- 
ties in an ad-hoc fashion. Another negative 
factor that I encountered was that DROPed tables kept 
showing up in the UI, and trying to force re-scans of the 
dropped tables resulted in errors. 

I would have given SQL defrag manager almost five 
stars if it weren't for its pricing. As amazing as this tool 
is, at $1,195 per managed server, it’s overpriced. Although 
Idera offers discounts for volume licenses, it doesn’t make 
that blatantly obvious on its website. Still, this solution 
has some great features and can help cut costs in terms of 
administrative effort, so it’s definitely worth an evaluation. 

[SQL] 
InstantDoc ID 102030 


Michael K. 
Campbell 


(mike @ sqlservervideos.com) is a consul- 
tant with years of SQL Server DBA and 


development experience. He spends most 


of his time engaged in consulting, technical 
evangelism, and creating free online SQL 


Server videos. 


ORE on the WEB 


See the web figure at 
InstantDoc ID 102030. 


SQL DEFRAG MANAGER 2.5 


Pros: Well-designed UI makes manual analysis and defragmentation 
a snap; automated analysis (via policies) is easy to configure and very versatile 
and flexible in terms of scope and capability; can save lots of time; excellent 
for both granular and exhaustive multi-server management 


Cons: Policies can’t be manually executed; doesn’t react well to DROPed or 
renamed tables and indexes; ideal for environments with lots of SQL Server 
instances, but Idera isn’t «ery clear about bulk licensing costs on its website 


Rating: We yx ye ae vy 


Price: $1,195 per managed server 


Recommendation: This product does a fantastic job of wrapping SQL Serv- 
er’s native index defragmentation tools with an intuitive and useful UI, and its 
automation policies are easier to work with than custom scripts and jobs. 


Contact: Idera e www.idera.com è 877-464-3372 
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Bytes from the Blog 


www.sqdlmag.com/go/industrybytes 


Privacy, Surveillance, and the 
Persistence of Data 

eff Jonas, a Las Vegas resident, data expert, and 
| ae conference speaker, shared how he uses 
his insights into data to shine a spotlight on the bad 
guys—be they casino scammers or international terror- 
ists. In a keynote address at the annual gathering of the 
COMMON users group, which advocates and educates 
on behalf of Power Systems (IBM i, AS/400, iSeries, 
System i, ATX, and Linux), Jonas explained the tenets 
of Non-Obvious Relationship Awareness (NORA), 
and how he shows casinos and US government agen- 
cies how to piece large amounts of data together like a 
jigsaw puzzle to pop the miscreants. 

Starting with a few simple assertions such as “the 
data must find the data” and “the relevance must find 
the user,” Jonas, distinguished engineer and chief sci- 
entist of Entity Analytic Solutions at IBM, provided a 
high-level look at how he interrogates large data sets. 
Without context, he says, it’s impossible to analyze 
data across the silos in which it finds itself. He calls this 
state of affairs “enterprise amnesia.” How, he asked, do 
you stitch everything together—the silo-ed data, the 
structured data, and the unstructured data? He set the 
problem this way: “How do you accumulate context?” 
His answer: The arrival of each piece of data must be 
treated as a query. You can’t know until the data is 
asked, “How does this relate to what I know?” how 
valuable a piece of information can be. 

When data is treated this way, a state Jonas calls 
“persistent presence,” a seemingly innocuous puzzle 
part, such as an address change or a repeated mis- 
spelled name can give rise to an epiphany that con- 
nects other puzzle pieces or let the analyst know that 
previous assumptions have been false. 

Of course if all data is marked this means that 
not only can the bad guys’ activities be traced, so can 
everyone else’s. We're moving toward what Jonas called 
a “surveillance society.” He pointed to the ACLU clock 
on social surveillance, designed on the model of the 
Atomic Scientists’ Doomsday Clock, which says that 
we're at six minutes to midnight—the witching hour 
for total surveillance. (The Atomic Scientists’ clock is 
currently at five minutes to midnight. It would be an 
interesting exercise to unsilo the data behind these two 
clocks.) 

Jonas anticipates that piles of data will eventually 
become one collective intelligence stored in the cloud. 
As for life in 2050, he predicts that this “collective 
intelligence will evaluate what you need to know and 
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tell you.” Privacy protections will be a matter of skillful 
coding. And of course eternal vigilance. 


European BI Market Trends 
Shifting 

he European PASS Conference 2009 has come 

and gone. As with the PASS Summit 2009 that 
will take place in Seattle later this year, business intelli- 
gence (BI) was a much-talked-about topic at European 
PASS, especially with its agenda filled with BI sessions, 
including Donald Farmer’s BI-focused keynote speech 
(“Navigating Through a Turbulent Economy with 
Microsoft BT”) and a SQL Server BI-focused precon- 
ference seminar. 

With a rekindled focus on SQL Server's native BI 
tools, and all of the news coming out of Microsoft about 
its new BI platforms, you might wonder why an organi- 
zation would spend the extra money to buy a third-party 
BI solution that offers much of what SQL Server offers 
out of the box. In fact, many organizations in Europe 
have been turning to third-party BI solutions instead of 
using SQL Server's native tools for years. However, that 
market trend seems to be reversing. Seamus Quinn, the 
editor of SQL Server Magazine’s sister email newsletter, 
Windows IT Pro News EMEA Edition, spoke with 
SQL Server MVP Allan Mitchell about how the BI 
tide is changing in Europe as more and more compa- 
nies are recognizing the benefits of using SQL Server's 
native BI tools. See what Mitchell had to say about 
the BI product market in 
Europe in Quinn’s “Euro- 
pean SQL Server Confer- 
ence Catches BI Tide,” 
available by going to www 
.windowsitpro.com/emea 
and typing 101745 in the 
InstantDoc ID box. 

To find out what events 
are taking place in Europe 
and what topics are being discussed by the European 
SQL Server and IT community, I recommend sub- 
scribing to the monthly Windows IT Pro News EMEA 
Edition email newsletter at www.windowsitpro.com/ 
email. You can check out some of Quinn’s past articles 
for the email newsletter by going to www.windowsitpro 
.com/authors and finding Quinn, Seamus. 

For more information about the European PASS 
Conference 2009, visit www.european-pass-conference 
com. [SQL 
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Download Central brings you the tools to meet your most critical SQL needs. 


A one-stop hub of countless free trial downloads from leading 
industry vendors, Download Central has done all the looking. 


All you have to do is see which tool is the best fit. 
And you get to do it all for FREE! 


Score Your Solution at Download Central! 
sqlmag.com/go/downloads 


DATABASE MONITORING 

Stored Procedure Monitoring and Enhanced User Role Auditing 
Secerno announced Secerno DataWall 3.3, the latest version of its database activity monitoring and policy Got a great 
enforcement hardware and virtual appliances. DataWall 3.3 monitors stored procedures and notifies you of any new product? 
unauthorized changes in their content. It provides both scheduled and real-time reporting of the content, owner- Send announce- 
ship, and history of the stored procedures in your SQL Server environment. DataWall 3.3 also offers enhanced ments to products @ 
user role auditing, which lets you monitor users’ login histories, as well as the roles, groups, permissions, and status 
assigned to each user. To learn more, visit www.secerno.com. 


sqlmag.com. 
—Jeff James, 


PERFORMANCE MANAGEMENT EanOt amet 
Automate and Simplify Transaction Process Management 

Precise has released Precise 8.5, the vendor’s transaction performance management product. (Precise was previ- 
ously part of Symantec, and its product was previously branded Symantec APM.) Precise 8.5 simplifies adminis- 
tration by letting you track transactions across tiers, detect transaction performance glitches, and then dive deep 
into the offending tier in order to resolve the problem when convenient. The product has been extended to the 
storage tier and notifies you if moving applications around will cause storage problems. It lets you manage the 
transaction performance of composite applications and includes an automated and simplified migration path. 
Also, Precise will transfer existing Symantec APM support contracts to Precise free of charge under the Precise 
APM Migration Program. Precise for MSSQL 8.5 supports SQL Server 2000 SP3 and later. To learn more, visit 
www.precise.com. 


SYSTEMS MANAGEMENT 
Microsoft Releases BizTalk Server 2009 E2E Perspective: Microsoft Logical Overview 
Microsoft has released Microsoft BizTalk Server 2009, the Business Applications Bacineselntel 
enterprise integration and connectivity server solution. The HA A X recess 
company says that BizTalk Server 2009 delivers new capa- 


igence 
B 


bilities to help customers integrate disparate systems, enhance a we so è 
; a : Sa mics 

end-to-end enterprise connectivity, and more quickly exchange y 

data across multiple platforms. With BizTalk Server 2009, cus- RFID and Sensor Services 


Distributed Edge’ Sites Running BizTalk® RFID 


tomers can now utilize the improvements available within the 
latest Microsoft products to boost their integration initiatives 
through enhanced developer productivity, improved system 
performance, and better manageability. The adapters available ajirata 
let organizations quickly use data from mainframes, databases, s : 
third-party applications, and radio frequency identification 
(RFID) devices using familiar management and development Physical Layer 
tools with less complexity and lower costs. BizTalk Server 2009 ; 

is available now to Microsoft Developer Network customers. 
To learn more, visit www.microsoft.com/biztalk. 


T-SQL 

Update Your Twitter Account Using T-SQL 

While Twitter is a great tool, you can’t do much without using a management 
add-on such as TweetDeck or TweetLater. Now, you can use T-SQL to perform 
tasks in Twitter through Tweet-SQL. Tweet-SQL currently offers more than 30 
CLR procedures. You can use Tweet-SQL to update your status on Twitter, follow 
other Twitter users, archive your public timeline and @replies, send direct messages 
to your followers, delete a status, and even clone Twitter apps. You can view Tweet- 
SQL data in three ways: as an XML resultset, a relational resultset, or output parameters. Tweet-SQL works 
with SQL Server 2005 or later, and requires the .NET Framework 3.5 or later. Download a free trial version of 
Tweet-SQL or purchase it for £25 at www.tweet-sql.com. [SQL 
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Fag Using SQL Server 2008's 


FILESTREAM Data Type 


S Server 2008’s FILESTREAM support 
Q combines access to LOBs from the NTFS 
file system with the integrity and easy access of the 
relational database engine. Here’s how to enable and use 
FILESTREAM support: 


Enable FILESTREAM: Servers 

To enable FILESTREAM support during or after 
installation or during an upgrade, use T-SQL and run 
the sp_configure procedure: 


EXEC sp_configure FILESTREAM_ 
access_level, 2; 

GO 

RECONFIGURE; 

GO 


The value of 2 enables FILESTREAM support for 
T-SQL and Win32; 1 enables FILESTREAM support 
for T-SQL; 0 disables FILESTREAM support. 


Enable FILESTREAM: 

Databases 

Use CREATE DATABASE or ALTER DATABASE 
to enable database FILESTREAM support. I used 
ALTER DATABASE to add a file group using the 
CONTAINS FILESTREAM keyword to the North- 
wind database: 


ALTER DATABASE Northwind ADD 
FILEGROUP FILESTREAMGroup 
CONTAINS FILESTREAM; 


Then I used the ADD FILE keywords to tell SQL 
Server what NTFS folder to use: 


ALTER DATABASE Northwind ADD FILE ( 
NAME = FSGroupFile, 
FILENAME = c:\FSDATA') 

TO FILEGROUP FILESTREAMGroup; 


FILESTREAM Data Type: Tables 
After enabling FILESTREAM support in the data- 
base, use T-SQL to create a table containing the 
FILESTREAM data type. Each row can have one or 
more FILESTREAM data types but must include a 
column using the UNIQUEIDENTIFIER data type 
with the ROWGUID attribute. 


USE Northwind; 
CREATE TABLE FILESTREAMTable ( 
FSDatal VARBINARY (MAX) 
FILESTREAM NULL, 
FSGUID UNIQUEIDENTIFIER NOT 
NULL ROWGUIDCOL 
UNIQUE DEFAULT NEWID ()) 


FILESTREAM Data Access: T-SQL 
You can use T-SQL to access FILESTREAM data 
without having to use ADO.NET. Just type this: 


INSERT INTO FILESTREAMTable 
VALUES (newid(), 


CAST ('FSDatal’ As VARBINARY (MAX) )) SQL] 


InstantDoc ID 102068 


YOUR SAVVY Ads! 


LINK tO TF Resources 


BAVC MISsSINg. 
hat kind of math are companies using? The 
first thing they do to save money is cut training 

and travel expenses, but the next thing they do is 
increase your workload—which requires new skills 
for which you need training. At SQL Server Magazine, 
we've done the math—and we want to help you get 
the expert training you need, for no cost at all, with 
two of our web seminars: 

1. “SQL Server Sprawl Stops Here: Alternative 
Thinking About SQL Consolidation.” Find out how you 
can consolidate SQL servers on the scale-up Integ- 
rity architecture, reduce TCO, improve availability, 


STANT 


and reduce system agement while improving 
manageability—all at the same time. 
http://sqlmag.com/go/SQLconsolidation 

2. “End Your Search for Affordable Search.” What 
good is data if users can’t find it? Learn how to imple- 
ment an affordable search solution to locate relevant 
business analysis data quickly from a user-friendly 
interface. http://sqlmag.com/go/AffordableSearch 


These seminars are online and on-demand. And 
they’re free. How’s that for doing the math? 
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Disk latency charts show you exactly how long reads and writes are taking for each 
disk and each database file, quickly highlighting disk bottlenecks. 
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Performance Dashboard with relevant SQL Server + Top SQL analysis highlights heaviest queries 


and Windows metrics 


e Graphical blocking and deadlock analysis 


Real-time and historical performance analysis ` 


Calendar views of Top SQL, blocks and deadlocks 


Disk activity, latency, and capacity monitoring e One-click and automated tracing with Quick Trace™ 


Free Trial Download: sqlsentry.net/sql-performance 
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Run a command to restart it 
s ance ANTY ..automatically! 
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Download it now: 


Idera.com/SQLDM6 


